Analysis of the neo4j underlying storage structure

Date: 2022-06-29 09:58:15  Editor: 袖梨  Source: 一聚教程网

1       Physical storage model of nodes and relationships in neo4j
1.1  The neo4j storage model


    The node records contain only a pointer to their first property and their first relationship (in what is often termed the relationship chain). From here, we can follow the (doubly) linked-list of relationships until we find the one we’re interested in, the LIKES relationship from Node 1 to Node 2 in this case. Once we’ve found the relationship record of interest, we can simply read its properties if there are any via the same singly-linked list structure as node properties, or we can examine the node records that it relates via its start node and end node IDs. These IDs, multiplied by the node record size, of course give the immediate offset of both nodes in the node store file.

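The English passage above is quoted from a book by Ian Robinson and describes the neo4j storage model. The properties of a Node or Relationship are stored as a linked list of key-value records (singly linked, according to the quoted passage). The relationships of a Node are stored as a doubly linked list, and from a relationship record you can easily find its from and to Nodes. A Node record stores the IDs of its first property and its first relationship.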
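The last sentence of the quoted passage (ID multiplied by record size gives the file offset) can be written down as a small sketch. The class name and the record size of 15 bytes are assumptions made for illustration only; the real node record size depends on the neo4j version.

```java
// Sketch: locating a fixed-size record in a store file by its ID.
final class RecordOffsets {
    // Assumed record size, for illustration only; not taken from the article.
    static final int ASSUMED_NODE_RECORD_SIZE = 15;

    /** Byte offset of the record with the given ID in a file of fixed-size records. */
    static long offsetOf(long id, int recordSize) {
        if (id < 0 || recordSize <= 0) {
            throw new IllegalArgumentException("id must be >= 0 and recordSize must be > 0");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(id, (long) recordSize);
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(100, ASSUMED_NODE_RECORD_SIZE));
    }
}
```

Because the offset is pure arithmetic, finding a record needs no search structure, only a seek to the computed position.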

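With this storage model, starting from a Node-A you can easily traverse the graph that is rooted at Node-A. The example below helps to illustrate the storage model; the concrete formats of the store files are described in detail in chapter 2.

1.2  Example 1


In this example, A~E are Node IDs, R1~R7 are Relationship IDs, and P1~P10 are Property IDs.

The Node store is illustrated below; each Node record holds its first Property and its first Relationship:


The relationship store is illustrated below:


As the diagram shows, starting from Node-B you can follow the next pointers of its relationships to visit all relationships of Node-B and thereby reach the first-level Nodes related to it. Traversing the relationships of those first-level Nodes in turn reaches the second-level Nodes, and so on.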
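The traversal described above can be sketched with in-memory arrays. This is a simplified model, not the neo4j record layout: the class, the field names, and the three-node sample graph are hypothetical, only the "next" pointers are modelled (the real chain is doubly linked), and self-loops are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: following a node's relationship chain through array-indexed records.
final class RelationshipChainSketch {
    static final int NO_REL = -1; // marks the end of a chain

    /** Simplified relationship record: two end nodes and one "next" pointer per end node. */
    static final class Rel {
        final int startNode;
        final int endNode;
        final int startNext; // next relationship in the chain of startNode
        final int endNext;   // next relationship in the chain of endNode

        Rel(int startNode, int endNode, int startNext, int endNext) {
            this.startNode = startNode;
            this.endNode = endNode;
            this.startNext = startNext;
            this.endNext = endNext;
        }
    }

    // Hypothetical sample graph: nodes 0, 1, 2 and relationships R0: 0->1, R1: 1->2, R2: 2->0.
    static final int[] FIRST_REL = {0, 0, 1}; // FIRST_REL[n] = first relationship of node n
    static final Rel[] RELS = {
        new Rel(0, 1, 2, 1),
        new Rel(1, 2, NO_REL, 2),
        new Rel(2, 0, NO_REL, NO_REL)
    };

    /** Returns the nodes directly connected to the given node, in chain order. */
    static List<Integer> neighbours(int node, int[] firstRel, Rel[] rels) {
        List<Integer> result = new ArrayList<>();
        int relId = firstRel[node];
        while (relId != NO_REL) {
            Rel r = rels[relId]; // the relationship ID is the array index
            if (r.startNode == node) {
                result.add(r.endNode);
                relId = r.startNext;
            } else {
                result.add(r.startNode);
                relId = r.endNext;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(neighbours(1, FIRST_REL, RELS));
    }
}
```

Each step is an array access by ID, which is the same idea the store files implement with fixed-size records on disk.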

2       Store files of a neo4j graph db
After downloading and installing neo4j-community-2.1.0-M01 and running the EmbeddedNeo4j example from neo4j embedded-example, the following neo4j graph db store files are generated under target/neo4j-hello-db.

-rw-r--r--     11 04-11 13:28 active_tx_log
drwxr-xr-x   4096 04-11 13:28 index
-rw-r--r--  23740 04-11 13:28 messages.log
-rw-r--r--     78 04-11 13:28 neostore
-rw-r--r--      9 04-11 13:28 neostore.id
-rw-r--r--     22 04-11 13:28 neostore.labeltokenstore.db
-rw-r--r--      9 04-11 13:28 neostore.labeltokenstore.db.id
-rw-r--r--     64 04-11 13:28 neostore.labeltokenstore.db.names
-rw-r--r--      9 04-11 13:28 neostore.labeltokenstore.db.names.id
-rw-r--r--     61 04-11 13:28 neostore.nodestore.db
-rw-r--r--      9 04-11 13:28 neostore.nodestore.db.id
-rw-r--r--     93 04-11 13:28 neostore.nodestore.db.labels
-rw-r--r--      9 04-11 13:28 neostore.nodestore.db.labels.id
-rw-r--r--    307 04-11 13:28 neostore.propertystore.db
-rw-r--r--    153 04-11 13:28 neostore.propertystore.db.arrays
-rw-r--r--      9 04-11 13:28 neostore.propertystore.db.arrays.id
-rw-r--r--      9 04-11 13:28 neostore.propertystore.db.id
-rw-r--r--     61 04-11 13:28 neostore.propertystore.db.index
-rw-r--r--      9 04-11 13:28 neostore.propertystore.db.index.id
-rw-r--r--    216 04-11 13:28 neostore.propertystore.db.index.keys
-rw-r--r--      9 04-11 13:28 neostore.propertystore.db.index.keys.id
-rw-r--r--    410 04-11 13:28 neostore.propertystore.db.strings
-rw-r--r--      9 04-11 13:28 neostore.propertystore.db.strings.id
-rw-r--r--     69 04-11 13:28 neostore.relationshipgroupstore.db
-rw-r--r--      9 04-11 13:28 neostore.relationshipgroupstore.db.id
-rw-r--r--     92 04-11 13:28 neostore.relationshipstore.db
-rw-r--r--      9 04-11 13:28 neostore.relationshipstore.db.id
-rw-r--r--     38 04-11 13:28 neostore.relationshiptypestore.db
-rw-r--r--      9 04-11 13:28 neostore.relationshiptypestore.db.id
-rw-r--r--    140 04-11 13:28 neostore.relationshiptypestore.db.names
-rw-r--r--      9 04-11 13:28 neostore.relationshiptypestore.db.names.id
-rw-r--r--     82 04-11 13:28 neostore.schemastore.db
-rw-r--r--      9 04-11 13:28 neostore.schemastore.db.id
-rw-r--r--      4 04-11 13:28 nioneo_logical.log.active
-rw-r--r--   2249 04-11 13:28 nioneo_logical.log.v0
drwxr-xr-x   4096 04-11 13:28 schema
-rw-r--r--      0 04-11 13:28 store_lock
-rw-r--r--    800 04-11 13:28 tm_tx_log.1

2.1  Files that store nodes
1)          Node data and its ID sequence

neostore.nodestore.db: stores the array of node records; the array index is the node ID
neostore.nodestore.db.id: stores the highest ID and the IDs that have been freed
2)          Node labels and their ID sequence

neostore.nodestore.db.labels: stores the array of node label data; the array index is the ID of the node label entry
neostore.nodestore.db.labels.id
2.2  Files that store relationships
1)          Relationship data and its ID sequence

neostore.relationshipstore.db: stores the array of relationship records
neostore.relationshipstore.db.id
2)          Relationship group data and its ID sequence

neostore.relationshipgroupstore.db: stores the array of relationship group records
neostore.relationshipgroupstore.db.id
3)          Relationship types and their ID sequence

neostore.relationshiptypestore.db: stores the array of relationship type records
neostore.relationshiptypestore.db.id
4)          Relationship type names and their ID sequence

neostore.relationshiptypestore.db.names: stores the array of relationship type token data
neostore.relationshiptypestore.db.names.id
2.3  Files that store labels
1)          Label token data and its ID sequence

neostore.labeltokenstore.db: stores the array of label token records
neostore.labeltokenstore.db.id
2)          Label token names and their ID sequence

neostore.labeltokenstore.db.names: stores the names of the label tokens
neostore.labeltokenstore.db.names.id
2.4  Files that store properties
1)          Property data and its ID sequence

neostore.propertystore.db: stores property data
neostore.propertystore.db.id
2)          Array-typed property values and their ID sequence

neostore.propertystore.db.arrays: stores the values of properties (key-value structure) whose value is an array
neostore.propertystore.db.arrays.id
3)          Long string property values and their ID sequence

neostore.propertystore.db.strings: stores the values of properties (key-value structure) whose value is a string
neostore.propertystore.db.strings.id
4)          Property index data and its ID sequence

neostore.propertystore.db.index: stores the index data for the keys of properties (key-value structure)
neostore.propertystore.db.index.id
5)          Property key strings and their ID sequence

neostore.propertystore.db.index.keys: stores the string values of the keys of properties (key-value structure)
neostore.propertystore.db.index.keys.id
2.5  Other files
1)          Version information

neostore
neostore.id
2)          Schema data

neostore.schemastore.db
neostore.schemastore.db.id
3)          The active logical log

nioneo_logical.log.active
4)          Name of the currently active log file

active_tx_log

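3       neo4j storage structure
In neo4j, the main kinds of store files (for nodes, properties, relationships, and so on) use an array as their core storage structure. Every node, property, and relationship item is assigned a unique ID, and that ID is used as the array index when the item is stored. A record can therefore be located directly by using its ID as the index, which is why operations such as graph traversal need no index lookup (index-free adjacency).

3.1  Partial class diagram of the neo4j store classes

[Figure: neo4j class diagram, store inheritance hierarchy]

3.1.1   CommonAbstractStore.java
CommonAbstractStore is the base class of all Store classes. The code fragment below shows its member variables. The most important ones are idGenerator, fileChannel, and windowPool: every Store instance has its own ID allocator (IdGenerator); StoreChannel handles reading, writing, and positioning in the store file; WindowPool is a cache of store records that improves performance.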

public abstract class CommonAbstractStore implements IdSequence
{
    public static abstract class Configuration
    {
        public static final Setting store_dir = InternalAbstractGraphDatabase.Configuration.store_dir;
        public static final Setting neo_store = InternalAbstractGraphDatabase.Configuration.neo_store;
        public static final Setting read_only = GraphDatabaseSettings.read_only;
        public static final Setting backup_slave = GraphDatabaseSettings.backup_slave;
        public static final Setting use_memory_mapped_buffers = GraphDatabaseSettings.use_memory_mapped_buffers;
    }

    public static final String ALL_STORES_VERSION = "v0.A.2";
    public static final String UNKNOWN_VERSION = "Uknown";

    protected Config configuration;
    private final IdGeneratorFactory idGeneratorFactory;
    private final WindowPoolFactory windowPoolFactory;
    protected FileSystemAbstraction fileSystemAbstraction;
    protected final File storageFileName;
    protected final IdType idType;
    protected StringLogger stringLogger;
    private IdGenerator idGenerator = null;
    private StoreChannel fileChannel = null;
    private WindowPool windowPool;
    private boolean storeOk = true;
    private Throwable causeOfStoreNotOk;
    private FileLock fileLock;
    private boolean readOnly = false;
    private boolean backupSlave = false;
    private long highestUpdateRecordId = -1;

3.2  neo4j db files and their storage format types

File name | Storage format
neostore.labeltokenstore.db | LabelTokenStore (TokenStore)
neostore.labeltokenstore.db.id | ID type
neostore.labeltokenstore.db.names | StringPropertyStore (AbstractDynamicStore, NAME_STORE_BLOCK_SIZE = 30)
neostore.labeltokenstore.db.names.id | ID type
neostore.nodestore.db | NodeStore
neostore.nodestore.db.id | ID type
neostore.nodestore.db.labels | ArrayPropertyStore (AbstractDynamicStore, label_block_size = 60)
neostore.nodestore.db.labels.id | ID type
neostore.propertystore.db | PropertyStore
neostore.propertystore.db.arrays | ArrayPropertyStore (AbstractDynamicStore, array_block_size = 120)
neostore.propertystore.db.arrays.id | ID type
neostore.propertystore.db.id | ID type
neostore.propertystore.db.index | PropertyIndexStore
neostore.propertystore.db.index.id | ID type
neostore.propertystore.db.index.keys | StringPropertyStore (AbstractDynamicStore, NAME_STORE_BLOCK_SIZE = 30)
neostore.propertystore.db.index.keys.id | ID type
neostore.propertystore.db.strings | StringPropertyStore (AbstractDynamicStore, string_block_size = 120)
neostore.propertystore.db.strings.id | ID type
neostore.relationshipgroupstore.db | RelationshipGroupStore
neostore.relationshipgroupstore.db.id | ID type
neostore.relationshipstore.db | RelationshipStore
neostore.relationshipstore.db.id | ID type
neostore.relationshiptypestore.db | RelationshipTypeTokenStore (TokenStore)
neostore.relationshiptypestore.db.id | ID type
neostore.relationshiptypestore.db.names | StringPropertyStore (AbstractDynamicStore, NAME_STORE_BLOCK_SIZE = 30)
neostore.relationshiptypestore.db.names.id | ID type
neostore.schemastore.db | SchemaStore (AbstractDynamicStore, BLOCK_SIZE = 56)
neostore.schemastore.db.id | ID type

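3.3  Common Store types
3.3.1    The id type
In a neo4j db, every Store has its own ID file (the file with the .id suffix). All of these files share the same format; they are listed below.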

[test00]$ ls -lh target/neo4j-test00.db/ | grep .id
-rw-r--r-- 9 04-11 13:28 neostore.id
-rw-r--r-- 9 04-11 13:28 neostore.labeltokenstore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.labeltokenstore.db.names.id
-rw-r--r-- 9 04-11 13:28 neostore.nodestore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.nodestore.db.labels.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.arrays.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.index.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.index.keys.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.strings.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshipgroupstore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshipstore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshiptypestore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshiptypestore.db.names.id
-rw-r--r-- 9 04-11 13:28 neostore.schemastore.db.id

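3.3.1.1        Storage format of ID files


A neo4j file with the ".id" suffix has the format shown in the figure above. It consists of two parts, a file header (9 bytes) and an array of long values:

sticky (1 byte): if sticky, the id generator wasn't closed properly, so it has to be rebuilt (go through the node, relationship, property, rel type etc. files).
nextFreeId (long): stores the highest ID, i.e. the next ID that has never been handed out; this value corresponds to the size of the record array of the associated store.
reuseId (long): stores IDs that have been freed and can be reused. Reusing IDs reduces holes in the record array and improves disk utilization.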
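A minimal sketch of the 9-byte header described above, written against a plain ByteBuffer instead of a real file. The class name is hypothetical, and the value 0 for the "clean" marker is an assumption; the article only shows that the first byte is compared with a constant named CLEAN_GENERATOR.

```java
import java.nio.ByteBuffer;

// Sketch: writing and reading the header of an ".id" file (sticky flag + high ID).
final class IdFileHeaderSketch {
    static final int HEADER_SIZE = 9;      // 1 byte sticky flag + 8 byte long
    static final byte CLEAN_GENERATOR = 0; // assumed marker value for a cleanly closed file

    /** Builds a header for a cleanly closed generator with the given high ID. */
    static ByteBuffer writeHeader(long highId) {
        ByteBuffer buffer = ByteBuffer.allocate(HEADER_SIZE);
        buffer.put(CLEAN_GENERATOR);
        buffer.putLong(highId);
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    /** Reads the high ID, rejecting headers that are too short or marked sticky. */
    static long readHighId(ByteBuffer header) {
        if (header.remaining() < HEADER_SIZE) {
            throw new IllegalArgumentException("Unable to read header, bytes available: " + header.remaining());
        }
        byte sticky = header.get();
        if (sticky != CLEAN_GENERATOR) {
            throw new IllegalStateException("Sticky generator, the id file must be rebuilt");
        }
        return header.getLong();
    }

    public static void main(String[] args) {
        System.out.println(readHighId(writeHeader(42L)));
    }
}
```

The 9-byte size matches the listing above, where every freshly created .id file is exactly 9 bytes long because it has a header but no reusable IDs yet.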
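3.3.1.2        IdGeneratorImpl.java
In neo4j, IDs for each resource type are allocated by IdGeneratorImpl, which is responsible for handing out IDs and for reclaiming and reusing them. For each resource type (node, relationship, property, and so on), an IdGenerator instance can be created to manage the allocation and reuse of its IDs.

3.3.1.2.1       Reading the id file during initialization
The code below, from IdGeneratorImpl.java, reads the id file during initialization. IdGeneratorImpl reads up to grabSize reusable IDs (reuseId) from the id file into idsReadFromFile (a LinkedList); when an id is requested, reusable IDs in idsReadFromFile are handed out first.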

 
// initialize the id generator and performs a simple validation
private synchronized void initGenerator()
{
    try
    {
        fileChannel = fs.open( fileName, "rw" );
        ByteBuffer buffer = ByteBuffer.allocate( HEADER_SIZE );
        readHeader( buffer );
        markAsSticky( buffer );
        fileChannel.position( HEADER_SIZE );
        maxReadPosition = fileChannel.size();
        defraggedIdCount = (int) (maxReadPosition - HEADER_SIZE) / 8;
        readIdBatch();
    }
    catch ( IOException e )
    {
        throw new UnderlyingStorageException(
                "Unable to init id generator " + fileName, e );
    }
}

private void readHeader( ByteBuffer buffer ) throws IOException
{
    readPosition = fileChannel.read( buffer );
    if ( readPosition != HEADER_SIZE )
    {
        fileChannel.close();
        throw new InvalidIdGeneratorException(
                "Unable to read header, bytes read: " + readPosition );
    }
    buffer.flip();
    byte storageStatus = buffer.get();
    if ( storageStatus != CLEAN_GENERATOR )
    {
        fileChannel.close();
        throw new InvalidIdGeneratorException( "Sticky generator[ " +
                fileName + "] delete this id file and build a new one" );
    }
    this.highId.set( buffer.getLong() );
}

private void readIdBatch()
{
    if ( !canReadMoreIdBatches() )
        return;
    try
    {
        int howMuchToRead = (int) Math.min( grabSize*8, maxReadPosition-readPosition );
        ByteBuffer readBuffer = ByteBuffer.allocate( howMuchToRead );
        fileChannel.position( readPosition );
        int bytesRead = fileChannel.read( readBuffer );
        assert fileChannel.position() <= maxReadPosition;
        readPosition += bytesRead;
        readBuffer.flip();
        assert (bytesRead % 8) == 0;
        int idsRead = bytesRead / 8;
        defraggedIdCount -= idsRead;
        for ( int i = 0; i < idsRead; i++ )
        {
            long id = readBuffer.getLong();
            if ( id != INTEGER_MINUS_ONE )
            {
                idsReadFromFile.add( id );
            }
        }
    }
    catch ( IOException e )
    {
        throw new UnderlyingStorageException(
                "Failed reading defragged id batch", e );
    }
}
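3.3.1.2.2       Freeing an id (freeId)
When a user frees an id, it is first added to releasedIdList (a LinkedList). Once releasedIdList holds at least grabSize freed ids, they are written to the end of the id file. An IdGeneratorImpl therefore caches at most 2 * grabSize ids (in releasedIdList and idsReadFromFile together).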

 
/**
 * Frees the id making it a defragged id that will be
 * returned by next id before any new id (that hasn't been used yet) is
 * returned.
 *
 * This method will throw an IOException if id is negative or
 * if id is greater than the highest returned id. However as stated in the
 * class documentation above the id isn't validated to see if it really is
 * free.
 */
@Override
public synchronized void freeId( long id )
{
    if ( id == INTEGER_MINUS_ONE )
    {
        return;
    }
    if ( fileChannel == null )
    {
        throw new IllegalStateException( "Generator closed " + fileName );
    }
    if ( id < 0 || id >= highId.get() )
    {
        throw new IllegalArgumentException( "Illegal id[" + id + "]" );
    }
    releasedIdList.add( id );
    defraggedIdCount++;
    if ( releasedIdList.size() >= grabSize )
    {
        writeIdBatch( ByteBuffer.allocate( grabSize*8 ) );
    }
}
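3.3.1.2.3       Allocating an id (nextId)
When a user requests an id, IdGeneratorImpl uses one of two allocation strategies, the "normal strategy" or the "aggressive strategy" (aggressiveReuse), selected by configuration.

-  "Normal strategy":

a)        Allocate from idsReadFromFile first. If idsReadFromFile is empty, first read freed, reusable ids from the corresponding id file into idsReadFromFile.

b)        If neither idsReadFromFile nor the id file holds any freed, reusable id, allocate a brand-new id, i.e. id = highId.get(), and increment highId by 1.

-  "Aggressive strategy" (aggressiveReuse):

a)        Allocate from releasedIdList (the list of ids that were just freed) first.

b)        Once releasedIdList is exhausted, allocate from idsReadFromFile. If idsReadFromFile is empty, first read freed, reusable ids from the corresponding id file into idsReadFromFile.

c)        If neither idsReadFromFile nor the id file holds any freed, reusable id, allocate a brand-new id, i.e. id = highId.get(), and increment highId by 1.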

 
/**
 * Returns the next "free" id. If a defragged id exist it will be returned
 * else the next free id that hasn't been used yet is returned. If no id
 * exist the capacity is exceeded (all values <= max are taken) and a
 * {@link UnderlyingStorageException} will be thrown.
 */
@Override
public synchronized long nextId()
{
    assertStillOpen();
    long nextDefragId = nextIdFromDefragList();
    if ( nextDefragId != -1 ) return nextDefragId;
    long id = highId.get();
    if ( id == INTEGER_MINUS_ONE )
    {
        // Skip the integer -1 (0xFFFFFFFF) because it represents
        // special values, f.ex. the end of a relationships/property chain.
        id = highId.incrementAndGet();
    }
    assertIdWithinCapacity( id );
    highId.incrementAndGet();
    return id;
}
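The two allocation strategies and the freeId path can be condensed into an in-memory sketch. It is not IdGeneratorImpl: the class name and the loadReusable helper are hypothetical, there is no id file, no grabSize batching, no capacity check, and the special handling of the value -1 is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: "normal" versus "aggressive" reuse of freed IDs, without any file I/O.
final class IdAllocatorSketch {
    private final Deque<Long> idsReadFromFile = new ArrayDeque<>(); // reusable IDs loaded earlier
    private final Deque<Long> releasedIds = new ArrayDeque<>();     // IDs freed in this session
    private final boolean aggressiveReuse;
    private long highId;

    IdAllocatorSketch(long highId, boolean aggressiveReuse) {
        this.highId = highId;
        this.aggressiveReuse = aggressiveReuse;
    }

    /** Hypothetical helper that stands in for reading a batch of reusable IDs from the id file. */
    void loadReusable(long... ids) {
        for (long id : ids) {
            idsReadFromFile.add(id);
        }
    }

    /** Remembers a freed ID; the real generator would flush a full batch to the id file. */
    void freeId(long id) {
        if (id < 0 || id >= highId) {
            throw new IllegalArgumentException("Illegal id[" + id + "]");
        }
        releasedIds.add(id);
    }

    long nextId() {
        if (aggressiveReuse && !releasedIds.isEmpty()) {
            return releasedIds.poll();     // aggressive: just-freed IDs come first
        }
        if (!idsReadFromFile.isEmpty()) {
            return idsReadFromFile.poll(); // both strategies: previously persisted reusable IDs
        }
        return highId++;                   // otherwise hand out a brand-new ID
    }

    public static void main(String[] args) {
        IdAllocatorSketch normal = new IdAllocatorSketch(10, false);
        normal.loadReusable(3);
        normal.freeId(5);
        System.out.println(normal.nextId() + " " + normal.nextId());
    }
}
```

In the normal strategy an ID freed in the current session is not handed out again until it has gone through the id file, whereas the aggressive strategy reuses it immediately.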

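3.3.2   The DynamicStore type
3.3.2.1        Storage format of AbstractDynamicStore
neo4j stores variable-length values such as strings in a group of fixed-size blocks that are chained together as a singly linked list. The class AbstractDynamicStore implements this; its Javadoc comment is reproduced below.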

/**
 * An abstract representation of a dynamic store. The difference between a
 * normal AbstractStore and a AbstractDynamicStore is
 * that the size of a record/entry can be dynamic.
 * Instead of a fixed record this class uses blocks to store a record. If a
 * record size is greater than the block size the record will use one or more
 * blocks to store its data.
 * A dynamic store don't have a IdGenerator because the position of a
 * record can't be calculated just by knowing the id. Instead one should use a
 * AbstractStore and store the start block of the record located in the
 * dynamic store. Note: This class makes use of an id generator internally for
 * managing free and non free blocks.
 * Note, the first block of a dynamic store is reserved and contains information
 * about the store.
 */
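The block chaining described in this section can be sketched as follows. This is an illustration under simplifying assumptions, not the AbstractDynamicStore code: the class and method names are hypothetical, blocks live in a Map instead of a file, and block ids are assigned consecutively, whereas the real store takes them from its id generator.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch: splitting a variable-length value into fixed-size blocks linked by "next" ids.
final class BlockChainSketch {
    static final long NO_NEXT_BLOCK = -1;

    static final class Block {
        final long id;
        final byte[] data;    // at most dataSize bytes of the value
        final long nextBlock; // id of the next block, or NO_NEXT_BLOCK

        Block(long id, byte[] data, long nextBlock) {
            this.id = id;
            this.data = data;
            this.nextBlock = nextBlock;
        }
    }

    /** Splits value into blocks of at most dataSize bytes, using ids firstId, firstId + 1, ... */
    static List<Block> split(byte[] value, int dataSize, long firstId) {
        if (dataSize <= 0) {
            throw new IllegalArgumentException("dataSize must be > 0");
        }
        int count = Math.max(1, (value.length + dataSize - 1) / dataSize);
        List<Block> blocks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int from = i * dataSize;
            int to = Math.min(value.length, from + dataSize);
            long next = (i == count - 1) ? NO_NEXT_BLOCK : firstId + i + 1;
            blocks.add(new Block(firstId + i, Arrays.copyOfRange(value, from, to), next));
        }
        return blocks;
    }

    /** Reassembles a value by following the nextBlock pointers from the start block. */
    static byte[] read(Map<Long, Block> store, long startId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long id = startId;
        while (id != NO_NEXT_BLOCK) {
            Block block = store.get(id);
            out.write(block.data, 0, block.data.length);
            id = block.nextBlock;
        }
        return out.toByteArray();
    }
}
```

With a data size of 120 bytes, for example, a 250-byte value occupies three blocks holding 120, 120, and 10 bytes; only the id of the first block has to be remembered by the owning record.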

 

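The file format used by AbstractDynamicStore is shown in the figure above. The whole file consists of a fixed-size array of blocks, each of size block_size = BLOCK_HEADER_SIZE (8 bytes) + block_content_size, followed by a string such as "StringPropertyStore v0.A.2", "ArrayPropertyStore v0.A.2", or "SchemaStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION). A block is accessed by using its id as the array index. The first 4 bytes of the first record in the file store block_size. Actual block data starts at the second record; each block consists of an 8-byte block_header and a fixed-size block_content (configurable). The block_header is structured as follows:

inUse (1 byte): the first byte, divided into 3 parts
[x__ ,    ]  0: start record, 1: linked record
[   x,    ]  inUse
[    ,xxxx]  high next block bits

Bits 1~4 hold the high 4 bits of next_block.
Bit 5 indicates whether the block is in use.
Bit 8 indicates whether the block is the first block of the singly linked list: 0 means the first block, 1 means a subsequent block.
nr_of_bytes (3 bytes): the length of the data stored in this block.
next_block (4 bytes): the low 4 bytes of next_block; together with bits 1~4 of inUse, next_block is 36 bits long in total. It is the pointer of a singly linked list stored in array form and holds the id of the next block that belongs to the same value.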
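The bit layout above can be checked with a small decoder. The class name is hypothetical and the two header words are passed in as already-read unsigned 32-bit values, so no file access is involved.

```java
// Sketch: decoding the 8-byte block header (two unsigned 32-bit words) of a dynamic store block.
final class DynamicBlockHeaderSketch {
    final boolean startRecord;
    final boolean inUse;
    final int nrOfBytes;
    final long nextBlock; // 36-bit block id, or -1 if there is no next block

    private DynamicBlockHeaderSketch(boolean startRecord, boolean inUse, int nrOfBytes, long nextBlock) {
        this.startRecord = startRecord;
        this.inUse = inUse;
        this.nrOfBytes = nrOfBytes;
        this.nextBlock = nextBlock;
    }

    /**
     * @param firstInteger     first 4 bytes, read as an unsigned int
     * @param nextBlockLowBits last 4 bytes, read as an unsigned int
     */
    static DynamicBlockHeaderSketch decode(long firstInteger, long nextBlockLowBits) {
        boolean startRecord = (firstInteger & 0x80000000L) == 0;   // top bit: 0 = start record
        boolean inUse = ((firstInteger & 0x7FFFFFFFL) >> 28) == 1; // bit 28: in-use flag
        int nrOfBytes = (int) (firstInteger & 0xFFFFFFL);          // low 24 bits: payload length
        long highBits = (firstInteger & 0x0F000000L) << 8;         // bits 24..27 become bits 32..35
        long nextBlock = (highBits == 0 && nextBlockLowBits == 0xFFFFFFFFL)
                ? -1L                                              // all ones and no high bits: end of chain
                : (nextBlockLowBits | highBits);
        return new DynamicBlockHeaderSketch(startRecord, inUse, nrOfBytes, nextBlock);
    }

    public static void main(String[] args) {
        DynamicBlockHeaderSketch header = decode(0x1100000AL, 5L);
        System.out.println(header.inUse + " " + header.nrOfBytes + " " + header.nextBlock);
    }
}
```

The four extra bits are what extend the block pointer from 32 to 36 bits without enlarging the header.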
3.3.2.2        AbstractDynamicStore.java
The member functions getRecord() and readAndVerifyBlockSize() in AbstractDynamicStore.java, shown below, help to understand the DynamicStore storage format.

DynamicRecord getRecord( long blockId, PersistenceWindow window, RecordLoad load )
{
    DynamicRecord record = new DynamicRecord( blockId );
    Buffer buffer = window.getOffsettedBuffer( blockId );
    /*
     * First 4b
     * [x   ,    ][    ,    ][    ,    ][    ,    ] 0: start record, 1: linked record
     * [   x,    ][    ,    ][    ,    ][    ,    ] inUse
     * [    ,xxxx][    ,    ][    ,    ][    ,    ] high next block bits
     * [    ,    ][xxxx,xxxx][xxxx,xxxx][xxxx,xxxx] nr of bytes in the data field in this record
     *
     */
    long firstInteger = buffer.getUnsignedInt();
    boolean isStartRecord = (firstInteger & 0x80000000) == 0;
    long maskedInteger = firstInteger & ~0x80000000;
    int highNibbleInMaskedInteger = (int) ( ( maskedInteger ) >> 28 );
    boolean inUse = highNibbleInMaskedInteger == Record.IN_USE.intValue();
    if ( !inUse && load != RecordLoad.FORCE )
    {
        throw new InvalidRecordException( "DynamicRecord Not in use, blockId[" + blockId + "]" );
    }
    int dataSize = getBlockSize() - BLOCK_HEADER_SIZE;
    int nrOfBytes = (int) ( firstInteger & 0xFFFFFF );
    /*
     * Pointer to next block 4b (low bits of the pointer)
     */
    long nextBlock = buffer.getUnsignedInt();
    long nextModifier = ( firstInteger & 0xF000000L ) << 8;
    long longNextBlock = longFromIntAndMod( nextBlock, nextModifier );
    boolean readData = load != RecordLoad.CHECK;
    if ( longNextBlock != Record.NO_NEXT_BLOCK.intValue()
            && nrOfBytes < dataSize || nrOfBytes > dataSize )
    {
        readData = false;
        if ( load != RecordLoad.FORCE )
        {
            throw new InvalidRecordException( "Next block set[" + nextBlock
                    + "] current block illegal size[" + nrOfBytes + "/" + dataSize + "]" );
        }
    }
    record.setInUse( inUse );
    record.setStartRecord( isStartRecord );
    record.setLength( nrOfBytes );
    record.setNextBlock( longNextBlock );
    /*
     * Data 'nrOfBytes' bytes
     */
    if ( readData )
    {
        byte byteArrayElement[] = new byte[nrOfBytes];
        buffer.get( byteArrayElement );
        record.setData( byteArrayElement );
    }
    return record;
}
readAndVerifyBlockSize()
protected void readAndVerifyBlockSize() throws IOException
{
    ByteBuffer buffer = ByteBuffer.allocate( 4 );
    getFileChannel().position( 0 );
    getFileChannel().read( buffer );
    buffer.flip();
    blockSize = buffer.getInt();
    if ( blockSize <= 0 )
    {
        throw new InvalidRecordException( "Illegal block size: " +
                blockSize + " in " + getStorageFileName() );
    }
}
3.3.2.3        Classes DynamicArrayStore and DynamicStringStore
SchemaStore, DynamicArrayStore (ArrayPropertyStore), and DynamicStringStore (StringPropertyStore) all inherit from AbstractDynamicStore, so the files that correspond to these classes all follow the AbstractDynamicStore storage format and differ only in the block size (block_size).

db file | Store type | block_size
neostore.labeltokenstore.db.names | StringPropertyStore | NAME_STORE_BLOCK_SIZE=30
neostore.propertystore.db.index.keys | StringPropertyStore | NAME_STORE_BLOCK_SIZE=30
neostore.relationshiptypestore.db.names | StringPropertyStore | NAME_STORE_BLOCK_SIZE=30
neostore.propertystore.db.strings | StringPropertyStore | string_block_size=120
neostore.nodestore.db.labels | ArrayPropertyStore | label_block_size=60
neostore.propertystore.db.arrays | ArrayPropertyStore | array_block_size=120
neostore.schemastore.db | SchemaStore | BLOCK_SIZE=56

block_size is set from the configuration file or from a default value. The code fragments below show how the neostore.propertystore.db.strings file is created and how the block_size value is passed in.

1)        GraphDatabaseSettings.java

public static final Setting string_block_size = setting("string_block_size", INTEGER, "120",min(1));
public static final Setting array_block_size = setting("array_block_size", INTEGER, "120",min(1));
public static final Setting label_block_size = setting("label_block_size", INTEGER, "60",min(1));


2)        The Configuration class in StoreFactory.java
public static abstract class Configuration
{
    public static final Setting string_block_size = GraphDatabaseSettings.string_block_size;
    public static final Setting array_block_size = GraphDatabaseSettings.array_block_size;
    public static final Setting label_block_size = GraphDatabaseSettings.label_block_size;
    public static final Setting dense_node_threshold = GraphDatabaseSettings.dense_node_threshold;
}
3)        The createPropertyStore function in StoreFactory.java

public void createPropertyStore( File fileName )
{
    createEmptyStore( fileName, buildTypeDescriptorAndVersion( PropertyStore.TYPE_DESCRIPTOR ));
    int stringStoreBlockSize = config.get( Configuration.string_block_size );
    int arrayStoreBlockSize = config.get( Configuration.array_block_size );
    createDynamicStringStore(new File( fileName.getPath() + STRINGS_PART), stringStoreBlockSize, IdType.STRING_BLOCK);
    createPropertyKeyTokenStore( new File( fileName.getPath() + INDEX_PART ) );
    createDynamicArrayStore( new File( fileName.getPath() + ARRAYS_PART ), arrayStoreBlockSize );
}
4)        The createDynamicStringStore function in StoreFactory.java
private void createDynamicStringStore( File fileName, int blockSize, IdType idType )
{
    createEmptyDynamicStore(fileName, blockSize, DynamicStringStore.VERSION, idType);
}
5)        The createEmptyDynamicStore function in StoreFactory.java

/**
 * Creates a new empty store. A factory method returning an implementation
 * should make use of this method to initialize an empty store. Block size
 * must be greater than zero. Note that the first block will be marked as
 * reserved (contains info about the block size). There will be an overhead
 * for each block of AbstractDynamicStore.BLOCK_HEADER_SIZE bytes.
 */
public void createEmptyDynamicStore( File fileName, int baseBlockSize,
        String typeAndVersionDescriptor, IdType idType)
{
    int blockSize = baseBlockSize;
    // sanity checks

    blockSize += AbstractDynamicStore.BLOCK_HEADER_SIZE;
    // write the header
    try
    {
        FileChannel channel = fileSystemAbstraction.create(fileName);
        int endHeaderSize = blockSize
                + UTF8.encode( typeAndVersionDescriptor ).length;
        ByteBuffer buffer = ByteBuffer.allocate( endHeaderSize );
        buffer.putInt( blockSize );
        buffer.position( endHeaderSize - typeAndVersionDescriptor.length() );
        buffer.put( UTF8.encode( typeAndVersionDescriptor ) ).flip();
        channel.write( buffer );
        channel.force( false );
        channel.close();
    }
    catch ( IOException e )
    {
        throw new UnderlyingStorageException( "Unable to create store "
                + fileName, e );
    }
    idGeneratorFactory.create( fileSystemAbstraction, new File( fileName.getPath() + ".id"), 0 );
    // TODO highestIdInUse = 0 works now, but not when slave can create store files.
    IdGenerator idGenerator = idGeneratorFactory.open(fileSystemAbstraction,
            new File( fileName.getPath() + ".id"),idType.getGrabSize(), idType, 0 );
    idGenerator.nextId(); // reserve first for blockSize
    idGenerator.close();
}
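The layout that createEmptyDynamicStore writes (one reserved block whose first 4 bytes hold the block size, followed by the type and version string) can be reproduced in memory. This is a sketch with hypothetical class and method names; it builds a byte array instead of writing a file and does not create the accompanying .id file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch: the bytes of a freshly created, empty dynamic store.
final class DynamicStoreHeaderSketch {
    static final int BLOCK_HEADER_SIZE = 8;

    /** Returns reserved block 0 (block size in its first 4 bytes) followed by the descriptor. */
    static byte[] emptyStoreBytes(int baseBlockSize, String typeAndVersionDescriptor) {
        if (baseBlockSize < 1) {
            throw new IllegalArgumentException("Block size must be greater than zero");
        }
        int blockSize = baseBlockSize + BLOCK_HEADER_SIZE;
        byte[] descriptor = typeAndVersionDescriptor.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(blockSize + descriptor.length);
        buffer.putInt(blockSize);   // first 4 bytes of the reserved block
        buffer.position(blockSize); // the rest of the reserved block stays zero
        buffer.put(descriptor);     // trailing type and version string
        return buffer.array();
    }

    /** Mirrors readAndVerifyBlockSize(): the block size is the first int of the file. */
    static int readBlockSize(byte[] storeBytes) {
        return ByteBuffer.wrap(storeBytes).getInt();
    }

    public static void main(String[] args) {
        byte[] bytes = emptyStoreBytes(120, "ArrayPropertyStore v0.A.2");
        System.out.println(bytes.length + " " + readBlockSize(bytes));
    }
}
```

For array_block_size=120 and the 25-character descriptor "ArrayPropertyStore v0.A.2" this gives 120 + 8 + 25 = 153 bytes, which matches the size of neostore.propertystore.db.arrays in the file listing of chapter 2. The same arithmetic gives 82 bytes for neostore.schemastore.db and 93 bytes for neostore.nodestore.db.labels.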
