|
At depths 4 and 5, however, you see a significant degradation of performance: a query involving 4 joins takes over 10 seconds to execute, while at depth 5 execution takes far too long, over a minute and a half, even though the count result doesn’t change. This illustrates the limitation of MySQL when modeling graph data: deep graphs require multiple joins, which relational databases typically don’t handle well.
Inefficiency of SQL joins
To find all of a user’s friends at depth 5, a relational database engine needs to generate the Cartesian product of the t_user_friend table five times. With 50,000 records in the table, the resulting set will have 50,000⁵ rows (3.125 × 10²³), which takes quite a lot of time and computing power to calculate. Then you discard more than 99% of them to return the just under 1,000 records that you’re interested in!
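To make the shape of such a query concrete, here is a minimal sketch that builds the depth-N self-join and runs it against a tiny in-memory table. It uses Python's built-in sqlite3 module rather than MySQL, and the column names user_1 and user_2 are assumptions made for illustration; only the table name t_user_friend comes from the text. It shows how one join is added per extra level of depth, not how any particular engine plans or executes the query.

```python
import sqlite3


def friends_at_depth_sql(depth: int) -> str:
    """Build a COUNT query that chains `depth` copies of t_user_friend."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Each extra level of depth adds one more self-join on the same table.
    joins = "".join(
        f" JOIN t_user_friend uf{i} ON uf{i - 1}.user_2 = uf{i}.user_1"
        for i in range(2, depth + 1)
    )
    return (
        f"SELECT COUNT(DISTINCT uf{depth}.user_2) "
        f"FROM t_user_friend uf1{joins} "
        "WHERE uf1.user_1 = ?"
    )


def count_friends_at_depth(conn: sqlite3.Connection, user_id: int, depth: int) -> int:
    """Run the generated query for one user and return the count."""
    return conn.execute(friends_at_depth_sql(depth), (user_id,)).fetchone()[0]


# Tiny sample data (hypothetical): a chain 1 -> 2 -> 3 -> 4, plus 1 -> 5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user_friend (user_1 INTEGER, user_2 INTEGER)")
conn.executemany(
    "INSERT INTO t_user_friend VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 4), (1, 5)],
)

for depth in range(1, 5):
    print(depth, count_friends_at_depth(conn, 1, depth))
```

Calling friends_at_depth_sql(5) produces a query with four JOIN clauses over five aliases of the same table, which is the five-way combination the sidebar describes.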
As you can see, relational databases are not so great for modeling many-to-many relationships, especially in large data sets. Neo4j, on the other hand, excels at many-to-many relationships, so let’s take a look at how it performs with the same data set. Instead of tables, columns, and foreign keys, you’re going to model users as nodes, and friendships as relationships between nodes. |
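Before turning to Neo4j itself, the idea of users as nodes and friendships as relationships can be sketched in plain Python. This is not Neo4j's API; the dictionary-of-sets representation, the function name friends_within_depth, and the sample names are all assumptions chosen for illustration. The point is that a traversal only touches the nodes it actually reaches, instead of combining whole tables and filtering afterward.

```python
from collections import deque


def friends_within_depth(graph, start, max_depth):
    """Return every user reachable from `start` in at most `max_depth` hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        user, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for friend in graph.get(user, ()):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    seen.discard(start)  # the starting user is not their own friend
    return seen


# Hypothetical sample graph: each key is a node, each set holds its
# outgoing "is friend of" relationships.
friendships = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave"},
    "dave": {"erin"},
}

print(sorted(friends_within_depth(friendships, "alice", 2)))
```

Note that this sketch collects everyone within the given depth, whereas the SQL queries discussed earlier count friends at one specific depth; the difference is deliberate, to keep the traversal short.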
|