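Neo4j: Cypher - Avoiding the Eager

Although I love how easy Cypher's LOAD CSV command makes it to get data into Neo4j, it currently breaks the rule of least surprise: it eagerly loads all the rows for some queries, even those that use periodic commit.

My colleague Michael pointed this out in the second of his blog posts explaining how to use LOAD CSV successfully:

"The biggest issue that people ran into, even when following the advice I gave earlier, was that for large imports of more than one million rows, Cypher ran into an out-of-memory situation. That was not related to commit sizes, so it happened even with PERIODIC COMMIT of small batches."

I recently spent a few days importing data into Neo4j on a Windows machine with 4GB RAM, so I was seeing this problem even earlier than Michael suggested.

Michael explains how to work out whether your query is suffering from unexpected eager evaluation:

"If you profile that query you see that there is an 'Eager' step in the query plan. That is where the 'pull in all data' happens. You can profile queries by prefixing your query with the word 'PROFILE'. You'll need to run your query in the console of /webadmin in your web browser or with the Neo4j shell."

I did this for my queries and was able to identify the query patterns that get evaluated eagerly. In some cases we can work around the problem.

We'll use the Northwind data set to demonstrate how the Eager pipe can sneak into our queries, but keep in mind that this data set is small enough not to cause issues.

This is what a row in the file looks like:

$ head -n 2 data/customerDb.csv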
OrderID,CustomerID,EmployeeID,OrderDate,RequiredDate,ShippedDate,ShipVia,Freight,ShipName,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry,CustomerID,CustomerCompanyName,ContactName,ContactTitle,Address,City,Region,PostalCode,Country,Phone,Fax,EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath,OrderID,ProductID,UnitPrice,Quantity,Discount,ProductID,ProductName,SupplierID,CategoryID,QuantityPerUnit,UnitPrice,UnitsInStock,UnitsOnOrder,ReorderLevel,Discontinued,SupplierID,SupplierCompanyName,ContactName,ContactTitle,Address,City,Region,PostalCode,Country,Phone,Fax,HomePage,CategoryID,CategoryName,Description,Picture
10248,VINET,5,1996-07-04,1996-08-01,1996-07-16,3,32.38,Vins et alcools Chevalier,59 rue de lAbbaye,Reims,,51100,France,VINET,Vins et alcools Chevalier,Paul Henriot,Accounting Manager,59 rue de lAbbaye,Reims,,51100,France,26.47.15.10,26.47.15.11,5,Buchanan,Steven,Sales Manager,Mr.,1955-03-04,1993-10-17,14 Garrett Hill,London,,SW1 8JR,UK,(71) 555-4848,3453,\x,Steven Buchanan graduated from St. Andrews University, Scotland, with a BSC degree in 1976. Upon joining the company as a sales representative in 1992, he spent 6 months in an orientation program at the Seattle office and then returned to his permanent post in London. He was promoted to sales manager in March 1993. Mr. Buchanan has completed the courses Successful Telemarketing and International Sales Management. He is fluent in French.,2,http://accweb/emmployees/buchanan.bmp,10248,11,14,12,0,11,Queso Cabrales,5,4,1 kg pkg.,21,22,30,30,0,5,Cooperativa de Quesos Las Cabras,Antonio del Valle Saavedra,Export Administrator,Calle del Rosal 4,Oviedo,Asturias,33007,Spain,(98) 598 76 54,,,4,Dairy Products,Cheeses,\x

MERGE, MERGE, MERGE

The first thing we want to do is create a node for each employee and each order, and then create a relationship between them. We might start with the following query:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

This does the job, but if we profile the query like so...

PROFILE LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
WITH row LIMIT 0
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

...we'll notice an "Eager" lurking on the third line:

+----------------+------+--------+----------------------------------+-----------------------------------------+
| Operator       | Rows | DbHits | Identifiers                      | Other                                   |
+----------------+------+--------+----------------------------------+-----------------------------------------+
| EmptyResult    | 0    | 0      |                                  |                                         |
| UpdateGraph(0) | 0    | 0      | employee, order, UNNAMED216      | MergePattern                            |
| Eager          | 0    | 0      |                                  |                                         |
| UpdateGraph(1) | 0    | 0      | employee, employee, order, order | MergeNode; :Employee; MergeNode; :Order |
| Slice          | 0    | 0      |                                  | { AUTOINT0}                             |
| LoadCSV        | 1    | 0      | row                              |                                         |
+----------------+------+--------+----------------------------------+-----------------------------------------+

You'll notice that when we profile each query we strip off the periodic commit section and add "WITH row LIMIT 0". This lets us generate enough of the query plan to identify the "Eager" operator without actually importing any data.

We want to split that query into two so that it can be processed in a non-eager manner:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
WITH row LIMIT 0
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})

+-------------+------+--------+----------------------------------+-----------------------------------------+
| Operator    | Rows | DbHits | Identifiers                      | Other                                   |
+-------------+------+--------+----------------------------------+-----------------------------------------+
| EmptyResult | 0    | 0      |                                  |                                         |
| UpdateGraph | 0    | 0      | employee, employee, order, order | MergeNode; :Employee; MergeNode; :Order |
| Slice       | 0    | 0      |                                  | { AUTOINT0}                             |
| LoadCSV     | 1    | 0      | row                              |                                         |
+-------------+------+--------+----------------------------------+-----------------------------------------+

Now that we've created the employees and orders, we can join them together:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (employee:Employee {employeeId: row.EmployeeID})
MATCH (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

+----------------+------+--------+-----------------------------+-----------------------------------------------------------+
| Operator       | Rows | DbHits | Identifiers                 | Other                                                     |
+----------------+------+--------+-----------------------------+-----------------------------------------------------------+
| EmptyResult    | 0    | 0      |                             |                                                           |
| UpdateGraph    | 0    | 0      | employee, order, UNNAMED216 | MergePattern                                              |
| Filter(0)      | 0    | 0      |                             | Property(order,orderId) == Property(row,OrderID)          |
| NodeByLabel(0) | 0    | 0      | order, order                | :Order                                                    |
| Filter(1)      | 0    | 0      |                             | Property(employee,employeeId) == Property(row,EmployeeID) |
| NodeByLabel(1) | 0    | 0      | employee, employee          | :Employee                                                 |
| Slice          | 0    | 0      |                             | { AUTOINT0}                                               |
| LoadCSV        | 1    | 0      | row                         |                                                           |
+----------------+------+--------+-----------------------------+-----------------------------------------------------------+

Not an Eager in sight!

MATCH, MATCH, MATCH, MERGE, MERGE

If we fast forward a few steps, we may have refactored our import script to the point where we create our nodes in one query and the relationships in another. Our create query works as expected:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})
MERGE (product:Product {productId: row.ProductID})

+-------------+------+--------+----------------------------------------------------+--------------------------------------------------------------+
| Operator    | Rows | DbHits | Identifiers                                        | Other                                                        |
+-------------+------+--------+----------------------------------------------------+--------------------------------------------------------------+
| EmptyResult | 0    | 0      |                                                    |                                                              |
| UpdateGraph | 0    | 0      | employee, employee, order, order, product, product | MergeNode; :Employee; MergeNode; :Order; MergeNode; :Product |
| Slice       | 0    | 0      |                                                    | { AUTOINT0}                                                  |
| LoadCSV     | 1    | 0      | row                                                |                                                              |
+-------------+------+--------+----------------------------------------------------+--------------------------------------------------------------+

We've now got employees, products and orders in the graph. Now let's create relationships between the trio:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (employee:Employee {employeeId: row.EmployeeID})
MATCH (order:Order {orderId: row.OrderID})
MATCH (product:Product {productId: row.ProductID})
MERGE (employee)-[:SOLD]->(order)
MERGE (order)-[:PRODUCT]->(product)

If we profile that, we'll notice that Eager has sneaked in again:

+----------------+------+--------+-----------------------------+-----------------------------------------------------------+
| Operator       | Rows | DbHits | Identifiers                 | Other                                                     |
+----------------+------+--------+-----------------------------+-----------------------------------------------------------+
| EmptyResult    | 0    | 0      |                             |                                                           |
| UpdateGraph(0) | 0    | 0      | order, product, UNNAMED318  | MergePattern                                              |
| Eager          | 0    | 0      |                             |                                                           |
| UpdateGraph(1) | 0    | 0      | employee, order, UNNAMED287 | MergePattern                                              |
| Filter(0)      | 0    | 0      |                             | Property(product,productId) == Property(row,ProductID)    |
| NodeByLabel(0) | 0    | 0      | product, product            | :Product                                                  |
| Filter(1)      | 0    | 0      |                             | Property(order,orderId) == Property(row,OrderID)          |
| NodeByLabel(1) | 0    | 0      | order, order                | :Order                                                    |
| Filter(2)      | 0    | 0      |                             | Property(employee,employeeId) == Property(row,EmployeeID) |
| NodeByLabel(2) | 0    | 0      | employee, employee          | :Employee                                                 |
| Slice          | 0    | 0      |                             | { AUTOINT0}                                               |
| LoadCSV        | 1    | 0      | row                         |                                                           |
+----------------+------+--------+-----------------------------+-----------------------------------------------------------+

In this case the Eager happens on our second call to MERGE, as Michael identified in his post:

"The issue is that within a single Cypher statement you have to isolate changes that affect matches further on, e.g. when you CREATE nodes with a label that are suddenly matched by a later MATCH or MERGE operation."

We can work around the problem in this case by having separate queries to create the relationships:

LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (employee:Employee {employeeId: row.EmployeeID})
MATCH (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

+----------------+------+--------+-----------------------------+-----------------------------------------------------------+
| Operator       | Rows | DbHits | Identifiers                 | Other                                                     |
+----------------+------+--------+-----------------------------+-----------------------------------------------------------+
| EmptyResult    | 0    | 0      |                             |                                                           |
| UpdateGraph    | 0    | 0      | employee, order, UNNAMED236 | MergePattern                                              |
| Filter(0)      | 0    | 0      |                             | Property(order,orderId) == Property(row,OrderID)          |
| NodeByLabel(0) | 0    | 0      | order, order                | :Order                                                    |
| Filter(1)      | 0    | 0      |                             | Property(employee,employeeId) == Property(row,EmployeeID) |
| NodeByLabel(1) | 0    | 0      | employee, employee          | :Employee                                                 |
| Slice          | 0    | 0      |                             | { AUTOINT0}                                               |
| LoadCSV        | 1    | 0      | row                         |                                                           |
+----------------+------+--------+-----------------------------+-----------------------------------------------------------+

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (order:Order {orderId: row.OrderID})
MATCH (product:Product {productId: row.ProductID})
MERGE (order)-[:PRODUCT]->(product)

+----------------+------+--------+----------------------------+--------------------------------------------------------+
| Operator       | Rows | DbHits | Identifiers                | Other                                                  |
+----------------+------+--------+----------------------------+--------------------------------------------------------+
| EmptyResult    | 0    | 0      |                            |                                                        |
| UpdateGraph    | 0    | 0      | order, product, UNNAMED229 | MergePattern                                           |
| Filter(0)      | 0    | 0      |                            | Property(product,productId) == Property(row,ProductID) |
| NodeByLabel(0) | 0    | 0      | product, product           | :Product                                               |
| Filter(1)      | 0    | 0      |                            | Property(order,orderId) == Property(row,OrderID)       |
| NodeByLabel(1) | 0    | 0      | order, order               | :Order                                                 |
| Slice          | 0    | 0      |                            | { AUTOINT0}                                            |
| LoadCSV        | 1    | 0      | row                        |                                                        |
+----------------+------+--------+----------------------------+--------------------------------------------------------+

MERGE, SET

I try to make LOAD CSV scripts as idempotent as possible, so that if we add more rows or columns of data to our CSV we can rerun the query without having to recreate everything.

This can lead you towards the following pattern, where we're creating suppliers:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (supplier:Supplier {supplierId: row.SupplierID})
SET supplier.companyName = row.SupplierCompanyName

We want to ensure that there's only one Supplier with that SupplierID, but we might be incrementally adding new properties and decide to just replace everything by using the "SET" command. If we profile that query, the Eager lurks:

+----------------+------+--------+--------------------+----------------------+
| Operator       | Rows | DbHits | Identifiers        | Other                |
+----------------+------+--------+--------------------+----------------------+
| EmptyResult    | 0    | 0      |                    |                      |
| UpdateGraph(0) | 0    | 0      |                    | PropertySet          |
| Eager          | 0    | 0      |                    |                      |
| UpdateGraph(1) | 0    | 0      | supplier, supplier | MergeNode; :Supplier |
| Slice          | 0    | 0      |                    | { AUTOINT0}          |
| LoadCSV        | 1    | 0      | row                |                      |
+----------------+------+--------+--------------------+----------------------+

We can work around this, at the cost of a bit of duplication, using "ON CREATE SET" and "ON MATCH SET":

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (supplier:Supplier {supplierId: row.SupplierID})
ON CREATE SET supplier.companyName = row.SupplierCompanyName
ON MATCH SET supplier.companyName = row.SupplierCompanyName

+-------------+------+--------+--------------------+----------------------+
| Operator    | Rows | DbHits | Identifiers        | Other                |
+-------------+------+--------+--------------------+----------------------+
| EmptyResult | 0    | 0      |                    |                      |
| UpdateGraph | 0    | 0      | supplier, supplier | MergeNode; :Supplier |
| Slice       | 0    | 0      |                    | { AUTOINT0}          |
| LoadCSV     | 1    | 0      | row                |                      |
+-------------+------+--------+--------------------+----------------------+

With the data set I've been working with, I was able to avoid OutOfMemory exceptions in some cases and cut the time it took to run the query by a factor of 3 in others.

As time goes by I expect all of these scenarios will be addressed, but as of Neo4j 2.1.5 these are the patterns that I've identified as being overly eager.

If you know of any others, do let me know and I can add them to the post or write a second part.

Source: https://www.javacodegeeks.com/2014/10/neo4j-cypher-avoiding-the-eager.html
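
The profiling check used throughout this post (run the query with PROFILE, then look for an "Eager" row in the plan) can also be done on plan text copied out of the shell. The following is only a minimal sketch in Python: the helper names `plan_operators` and `has_eager` are hypothetical, and it assumes the plan is a pipe-delimited table with the operator name in the first column, as in the plans above. It does not talk to Neo4j at all.

```python
def plan_operators(plan_text):
    """Return the operator names found in the first column of a plan table.

    Assumption: each data row starts with '|' and the first cell is the
    operator name. Border lines and the header row are skipped.
    """
    operators = []
    for line in plan_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border line such as +----+ or a blank line
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        name = cells[0]
        if name and name != "Operator":
            operators.append(name)
    return operators


def has_eager(plan_text):
    """True if the plan contains an operator named exactly 'Eager'."""
    return "Eager" in plan_operators(plan_text)


# A shortened copy of a plan in the same layout, for illustration only.
SAMPLE_PLAN = """
+----------------+------+--------+
| Operator       | Rows | DbHits |
+----------------+------+--------+
| EmptyResult    | 0    | 0      |
| UpdateGraph(0) | 0    | 0      |
| Eager          | 0    | 0      |
| UpdateGraph(1) | 0    | 0      |
| Slice          | 0    | 0      |
| LoadCSV        | 1    | 0      |
+----------------+------+--------+
"""

if __name__ == "__main__":
    print(plan_operators(SAMPLE_PLAN))
    print("Eager present:", has_eager(SAMPLE_PLAN))
```

A check like this could sit in front of an import script so that a query is only run for real once its "WITH row LIMIT 0" plan comes back without an Eager operator.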