AvocadoDB: An Overview and How to Use It from Java

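This time I will explain the basic concepts of AvocadoDB and how to use it from Java. By the way, AvocadoDB has volunteered to be an early guinea pig for mruby. Their reasons for trying mruby are written on their blog. In my previous article I wrote that AvocadoDB is multi-threaded, and I was half thinking "isn't handling V8 from multiple threads a pain?" Apparently they thought so too, lol. They also seem to be aiming to turn AvocadoDB's query language into an RFC. Very ambitious.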

Note: this article is based on version 0.3.12, so things may change in the future.

First, some AvocadoDB terminology. I will not cover everything, only a part.

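Term	Description
Collection	The equivalent of a table in an RDB.
Collection Identifier	Internal ID that identifies a collection (an integer).
Collection Name	Name that identifies a collection (a string). Internally it maps 1:1 to a Collection Identifier. The naming rule: the first character is a letter, and the rest are letters, digits, underscores, or hyphens. Apart from the hyphen, this matches the variable-name rules of most programming languages.
Document	The equivalent of a record in an RDB.
Document Identifier	Internal ID that identifies a Document (an integer). You almost never use it directly; normally you use the Document Handle below.
Document Handle	Handle name that identifies a Document (a string). The format is CollectionID + "/" + DocumentID.
Document Revision	Corresponds to the version number in the MV (Multi Version) part of MVCC. An integer. It really should be explained together with the Document Etag, but I will skip that this time.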
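
To make the naming rule and the handle format concrete, here is a minimal sketch. The class AvocadoNames and both method names are my own inventions for illustration and are not part of the driver; the regular expression is simply my reading of the rule in the table, and the IDs are made-up values.

```java
import java.util.regex.Pattern;

final class AvocadoNames {
    // Assumed rule from the table: a letter first, then letters, digits, '_' or '-'.
    private static final Pattern COLLECTION_NAME =
            Pattern.compile("[A-Za-z][A-Za-z0-9_-]*");

    // True if the whole name satisfies the assumed naming rule.
    static boolean isValidCollectionName(String name) {
        return name != null && COLLECTION_NAME.matcher(name).matches();
    }

    // Document Handle = Collection ID + "/" + Document ID.
    static String documentHandle(long collectionId, long documentId) {
        return collectionId + "/" + documentId;
    }

    public static void main(String[] args) {
        System.out.println(isValidCollectionName("example_collection1")); // true
        System.out.println(isValidCollectionName("1st-collection"));      // false
        System.out.println(documentHandle(123456L, 7890L));               // 123456/7890
    }
}
```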

The HTTP REST interface supports both Collection Name and Collection ID. The actual value of a Document is JSON. A JSON object or a JSON array is fine, and since AvocadoDB is also a KVS, it accepts bare JSON values (string, integer, Boolean, null) as well.

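Let me also say a little about AvocadoDB as a graph DB.
A graph consists of vertices and edges. A Vertex corresponds to a Document. Edges are created through a dedicated API. You cannot specify whether the graph is directed or undirected (that is, whether edges have a direction) when you create an Edge. Edges are basically created as directed edges, and you specify any, in, or out when you retrieve them. Since a Vertex is a Document, it can store arbitrary attributes, and an Edge can store arbitrary attributes in the same way.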
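
The any / in / out idea can be sketched with a small in-memory model. This is not how the server or the driver implements it; the Edge class, the Direction enum, and edgesOf are hypothetical names used only for this illustration. Edges are stored with a direction, and the direction only matters at lookup time.

```java
import java.util.ArrayList;
import java.util.List;

final class EdgeDirectionSketch {
    enum Direction { ANY, IN, OUT }

    // A directed edge between two vertex handles (hypothetical in-memory model).
    static final class Edge {
        final String from;
        final String to;

        Edge(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    // Returns the edges that touch `vertex`, filtered by the requested direction.
    static List<Edge> edgesOf(List<Edge> all, String vertex, Direction dir) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : all) {
            boolean out = e.from.equals(vertex);
            boolean in = e.to.equals(vertex);
            if ((dir == Direction.OUT && out)
                    || (dir == Direction.IN && in)
                    || (dir == Direction.ANY && (in || out))) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("v0", "v1"));
        edges.add(new Edge("v0", "v2"));
        edges.add(new Edge("v2", "v3"));
        System.out.println(edgesOf(edges, "v2", Direction.IN).size());  // 1 (v0->v2)
        System.out.println(edgesOf(edges, "v2", Direction.OUT).size()); // 1 (v2->v3)
        System.out.println(edgesOf(edges, "v2", Direction.ANY).size()); // 2
    }
}
```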

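There are currently three kinds of index: Hash Index, Geo Index, and Skip Lists. (Maybe the reason there is no B-tree becomes clear if you watch the video?) Indexes can be composite, but a Geo Index is limited to at most two elements. An index can carry a unique constraint. I trust I do not need to explain how Hash and Skip Lists differ in their characteristics.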
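
For anyone who wants a quick refresher anyway, here is a rough analogy using standard Java collections. It says nothing about AvocadoDB's internals. A HashMap stands in for a hash index (exact-match lookups only), a ConcurrentSkipListMap stands in for a skip list index (ordered, so range lookups work), and putIfAbsent imitates a unique constraint. The document handles used as values are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

final class IndexSketch {
    // Imitates a unique constraint: refuses a key that is already present.
    static boolean insertUnique(Map<Integer, String> index, int key, String handle) {
        return index.putIfAbsent(key, handle) == null;
    }

    // Range lookup; only the ordered (skip list) structure can answer this without a full scan.
    static List<String> range(ConcurrentSkipListMap<Integer, String> index,
                              int fromInclusive, int toExclusive) {
        return new ArrayList<>(index.subMap(fromInclusive, true, toExclusive, false).values());
    }

    public static void main(String[] args) {
        Map<Integer, String> hashIndex = new HashMap<>();
        ConcurrentSkipListMap<Integer, String> skipListIndex = new ConcurrentSkipListMap<>();

        int[] ages = {42, 25, 31};
        for (int i = 0; i < ages.length; i++) {
            String handle = "1/" + (i + 1); // made-up document handle
            insertUnique(hashIndex, ages[i], handle);
            insertUnique(skipListIndex, ages[i], handle);
        }

        System.out.println(hashIndex.get(25));                   // 1/2 (exact match)
        System.out.println(range(skipListIndex, 20, 40));        // [1/2, 1/3] (20 <= age < 40)
        System.out.println(insertUnique(hashIndex, 25, "1/99")); // false (duplicate key)
    }
}
```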

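When you create a Collection or a Document, there is an option for whether writes to disk are synchronous. It is similar to InnoDB's flush parameter, and basically you will not get good performance unless you make it async. (Note: performance testing is scheduled for 1.0beta on the roadmap, so it is a bit off the mark to talk about performance at this stage.)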
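
Why synchronous writes cost so much can be felt with a tiny experiment in plain Java. This is only an analogy, not AvocadoDB's storage code: SyncWriteSketch, timeWrites, and the syncEachWrite flag are names I made up, and the actual numbers depend entirely on your disk and OS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SyncWriteSketch {
    // Writes `records` small JSON lines to a temp file and returns the elapsed nanoseconds.
    static long timeWrites(int records, boolean syncEachWrite) {
        try {
            Path file = Files.createTempFile("sync-sketch", ".log");
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                long start = System.nanoTime();
                for (int i = 0; i < records; i++) {
                    byte[] json = ("{\"n\":" + i + "}\n").getBytes(StandardCharsets.UTF_8);
                    ByteBuffer buf = ByteBuffer.wrap(json);
                    while (buf.hasRemaining()) {
                        ch.write(buf);
                    }
                    if (syncEachWrite) {
                        ch.force(false); // block until the data has reached the storage device
                    }
                }
                return System.nanoTime() - start;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sync  : " + timeWrites(200, true) / 1_000 + " us");
        System.out.println("async : " + timeWrites(200, false) / 1_000 + " us");
    }
}
```

On most machines the first line should come out noticeably larger, because every record waits for the device before the next one is written.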

Next, how to use it from Java. The Java driver is here. I have not built a JAR file. Should I?
Note: it is a prototype, so there is no handling yet for the case where the server is down.

What it can do right now is call the HTTP REST API to operate on Collection, Document, Cursor, Index, and Edge.

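Usage is described in the README and the Example. Looking at the unit tests is probably the best way to learn it.
The following sample creates 1000 arbitrary Documents and searches them with a query. To make a Cursor easier to handle, I wrote a ResultSet.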

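import java.util.HashMap;
// AvocadoConfigure, AvocadoDriver, AvocadoException and CursorResultSet must also be imported from the driver's package.

public class Example1 {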

  public static class ExampleEntity {
    public String name;
    public String gender;
    public int age;
  }

  public static void main(String[] args) {

    AvocadoConfigure configure = new AvocadoConfigure();
    AvocadoDriver driver = new AvocadoDriver(configure);

    try {
      // Create 1000 documents with a rotating gender and a random age between 10 and 109.
      for (int i = 0; i < 1000; i++) {
        ExampleEntity value = new ExampleEntity();
        value.name = "TestUser" + i;
        switch (i % 3) {
        case 0: value.gender = "MAN"; break;
        case 1: value.gender = "WOMAN"; break;
        case 2: value.gender = "OTHER"; break;
        }
        value.age = (int) (Math.random() * 100) + 10;
        driver.createDocument("example_collection1", value, true, null, null);
      }

      // Value for the @gender@ placeholder in the query below.
      HashMap<String, Object> bindVars = new HashMap<String, Object>();
      bindVars.put("gender", "WOMAN");

      CursorResultSet<ExampleEntity> rs = driver.executeQueryWithResultSet(
          "select t from example_collection1 t where t.age >= 20 && t.age < 30 && t.gender == @gender@",
          bindVars, ExampleEntity.class, true, 10);

      System.out.println(rs.getTotalCount());
      for (ExampleEntity obj: rs) {
        System.out.printf("  %15s(%5s): %d%n", obj.name, obj.gender, obj.age);
      }

    } catch (AvocadoException e) {
      e.printStackTrace();
    } finally {
      driver.shutdown();
    }

  }

}
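
The idea behind wrapping a Cursor in a ResultSet can be sketched without the driver. The following is not the driver's CursorResultSet; BatchedResultSet and the fake page function are hypothetical, and a function from batch index to a list stands in for the HTTP call that fetches the next batch from the server. The point is that the caller just writes a for-each loop while batches are fetched lazily behind it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

// Iterates over all results while fetching them one batch at a time.
final class BatchedResultSet<T> implements Iterable<T> {
    // Maps a batch index to its items; an empty list means "no more results".
    private final IntFunction<List<T>> fetchBatch;

    BatchedResultSet(IntFunction<List<T>> fetchBatch) {
        this.fetchBatch = fetchBatch;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int nextBatch = 0;
            private boolean done = false;
            private Iterator<T> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Fetch the next batch only when the current one is used up.
                while (!current.hasNext()) {
                    if (done) {
                        return false;
                    }
                    List<T> batch = fetchBatch.apply(nextBatch++);
                    if (batch.isEmpty()) {
                        done = true;
                        return false;
                    }
                    current = batch.iterator();
                }
                return true;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}

final class BatchedResultSetDemo {
    // Stand-in for the server: serves the integers 0..total-1 in slices of batchSize.
    static List<Integer> page(int batchIndex, int batchSize, int total) {
        List<Integer> out = new ArrayList<>();
        int start = batchIndex * batchSize;
        for (int i = start; i < Math.min(start + batchSize, total); i++) {
            out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        BatchedResultSet<Integer> rs = new BatchedResultSet<>(i -> page(i, 10, 25));
        int count = 0;
        for (int value : rs) {
            count++;
        }
        System.out.println(count); // 25, fetched as batches of 10, 10 and 5
    }
}
```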

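Graphs are supported too, and they are handled as shown below. However, there is no way to traverse them efficiently yet. They apparently plan to support TinkerPop's Blueprints and Gremlin, so I expect that whatever those can do will become possible in the future. (I have never manipulated a graph with Gremlin, so I do not know the details.)
The sample below creates 10 vertices and three edges (e1=v0->v1, e2=v0->v2, e3=v2->v3), then retrieves the edges of v0.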

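import java.util.ArrayList;
// AvocadoConfigure, AvocadoDriver, AvocadoException, DocumentEntity, EdgeEntity, EdgesEntity and Direction must also be imported from the driver's package.

public class Example2 {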

  public static class TestEdgeAttribute {
    public String a;
    public int b;
    public TestEdgeAttribute(){}
    public TestEdgeAttribute(String a, int b) {
      this.a = a;
      this.b = b;
    }
  }
  public static class TestVertex {
    public String name;
  }

  public static void main(String[] args) {

    AvocadoConfigure configure = new AvocadoConfigure();
    AvocadoDriver driver = new AvocadoDriver(configure);

    final String collectionName = "example";
    try {

      // Create 10 vertices (a Vertex is just a Document).
      ArrayList<DocumentEntity<TestVertex>> docs = new ArrayList<DocumentEntity<TestVertex>>();
      for (int i = 0; i < 10; i++) {
        TestVertex value = new TestVertex();
        value.name = "vvv" + i;
        DocumentEntity<TestVertex> doc = driver.createDocument(collectionName, value, true, false, null);
        docs.add(doc);
      }

      // 0 -> 1
      // 0 -> 2
      // 2 -> 3

      EdgeEntity<TestEdgeAttribute> edge1 = driver.createEdge(
          collectionName, docs.get(0).getDocumentHandle(), docs.get(1).getDocumentHandle(),
          new TestEdgeAttribute("edge1", 100));

      EdgeEntity<TestEdgeAttribute> edge2 = driver.createEdge(
          collectionName, docs.get(0).getDocumentHandle(), docs.get(2).getDocumentHandle(),
          new TestEdgeAttribute("edge2", 200));

      EdgeEntity<TestEdgeAttribute> edge3 = driver.createEdge(
          collectionName, docs.get(2).getDocumentHandle(), docs.get(3).getDocumentHandle(),
          new TestEdgeAttribute("edge3", 300));

      // Fetch the edges attached to v0 in either direction (edge1 and edge2 were created on v0 above).
      EdgesEntity<TestEdgeAttribute> edges = driver.getEdges(collectionName, docs.get(0).getDocumentHandle(), Direction.ANY, TestEdgeAttribute.class);
      System.out.println(edges.size());
      System.out.println(edges.get(0).getAttributes().a);
      System.out.println(edges.get(1).getAttributes().a);

    } catch (AvocadoException e) {
      e.printStackTrace();
    } finally {
      driver.shutdown();
    }

  }

}
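
Until a traversal API exists, a traversal has to happen on the client side. Here is a minimal sketch of a breadth-first walk over the same three edges as the sample. Everything here is an assumption for illustration: TraversalSketch and reachableFrom are my own names, and the adjacency map is filled by hand, whereas a real client would presumably build it from repeated edge lookups (one round trip per vertex, which is exactly why this is not efficient).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TraversalSketch {
    // Breadth-first walk along outgoing edges; returns the vertices in visit order.
    static List<String> reachableFrom(String start, Map<String, List<String>> outEdges) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String vertex = queue.remove();
            List<String> targets = outEdges.getOrDefault(vertex, Collections.emptyList());
            for (String next : targets) {
                if (visited.add(next)) { // add() is false when already visited
                    queue.add(next);
                }
            }
        }
        return new ArrayList<>(visited);
    }

    public static void main(String[] args) {
        // Same shape as the sample: v0->v1, v0->v2, v2->v3.
        Map<String, List<String>> outEdges = new HashMap<>();
        outEdges.computeIfAbsent("v0", k -> new ArrayList<>()).add("v1");
        outEdges.computeIfAbsent("v0", k -> new ArrayList<>()).add("v2");
        outEdges.computeIfAbsent("v2", k -> new ArrayList<>()).add("v3");
        System.out.println(reachableFrom("v0", outEdges)); // [v0, v1, v2, v3]
    }
}
```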