Graph databases: The journey from flat earth to metaverse

Rangan continues: ‘The language that we developed to query the data, it is one of the most intuitive languages around. It’s called Cypher and everyone at Neo4j understands it, whether they’re in IT, marketing, HR or sales. We have a visual browser that makes it super intuitive to use.

‘When it comes to data science and AI/ML, that’s a whole new area, which is super interesting because, unlike in databases where you have DBAs and developers, now data scientists are developers who build AI/ML models. The exciting thing is that this is accessible to a majority of data scientists that haven’t trained as data scientists.’

Taking the science out of data scientist

A simplified language and a simpler way of connecting information together are making the role of the data scientist not only easier, but more accessible. They are taking the need to be a true scientist out of the data analysis role. Neo4j’s graph databases are also designed to work with the most common demands of the user. Rangan continues: ‘We built 60+ ML algorithms into the product, so the common questions that you may want to ask are translated with those algorithms to develop the model we have.’

Maximising the database through ML

While we welcome innovation, is there a fear that using machine learning to manipulate data may amplify any bias in the machine? Should we be worried about “inference” in collating the data?

Rangan continues: ‘When you load data on the database for the first time there could be gaps, especially when you take table-based data where relationships are inferred and then you try to put those relationships in a graph database, then there is a chance for, “hey, do I really know the relationship? Should I infer something about it?” So, there’s a trade-off between, “do I make sure the loading is perfect” versus “it’s not”.

‘The moment we hit a gap in the data we call it a “hanging relationship” and we will flag that relationship back to the user and say “how do you want to deal with it?” We avoid inferring a relationship because our point of view is very straightforward: an inferred relationship with low data is corrupt data.’
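To make the idea concrete, here is a minimal sketch of flagging rather than inferring. It is not Neo4j's loader: the function `build_edges`, the `Flagged` record, the `PLACED_BY` relationship name and the sample rows are all hypothetical, chosen only to show table rows being turned into graph edges while any reference to a missing row is handed back to the user.

```python
from typing import NamedTuple


class Flagged(NamedTuple):
    """A reference that could not be resolved during loading (hypothetical record)."""
    order_id: str
    missing_customer_id: str


def build_edges(customers, orders):
    """Turn foreign-key style rows into explicit edges.

    An order whose customer_id has no matching customer row is flagged
    for the user instead of being linked to a guessed or placeholder node.
    """
    known = {row["id"] for row in customers}
    edges, flagged = [], []
    for order in orders:
        target = order["customer_id"]
        if target in known:
            edges.append((order["id"], "PLACED_BY", target))
        else:
            flagged.append(Flagged(order["id"], target))
    return edges, flagged


# Illustrative sample data: order o2 points at a customer that was never loaded.
customers = [{"id": "c1", "name": "Ada"}, {"id": "c2", "name": "Grace"}]
orders = [
    {"id": "o1", "customer_id": "c1"},
    {"id": "o2", "customer_id": "c9"},
    {"id": "o3", "customer_id": "c2"},
]

edges, flagged = build_edges(customers, orders)
for item in flagged:
    print(f"Order {item.order_id} refers to unknown customer {item.missing_customer_id}")
```

The design choice mirrors the quote: the loader creates only the relationships the data supports and leaves the decision about each gap to a person.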

A tool for data and forecasting

While there may be reassurance that the data isn’t invented or misinterpreted, there is still room for hypothesis within the setup: the database can be not just a cataloguing tool, but also the basis for an efficient forecasting tool. Rangan explains: ‘The thing to understand is the graph database itself does not create synthetic data and we do not do best guesses; that is a technique that is used by the developer or the data scientist to create a hypothesis.

‘If you look at digital twinning, it’s an alternative way to model outcomes. If you have a digital model of the physical network, you can take the digital model and do “what if” analyses. So, you can explore what if the data looked like this? Which is really “what if the world looked like this? What could happen?” So, it gives you predictive power, based on “what if” scenarios. But it’s entirely up to the practitioner what those “what-if” scenarios are, and therefore if those scenarios happen how could the network interact? Will it break down? At what point does it break down? How can I fix it and how can I plan for it? So, it becomes a very powerful planning tool.’
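A “what if” analysis of this kind can be sketched in a few lines. The example below is an illustration only, assuming a toy supply network held as a plain Python dictionary; the node names and the functions `reachable` and `what_if_fails` are hypothetical and do not come from Neo4j. It asks which sites would be cut off from the source if a chosen set of nodes failed.

```python
from collections import deque


def reachable(graph, start, removed=frozenset()):
    """Breadth-first search: every node reachable from start, skipping removed nodes."""
    if start in removed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def what_if_fails(graph, source, failed):
    """Return the nodes that lose their connection to source if `failed` go down."""
    failed = frozenset(failed)
    baseline = reachable(graph, source)
    scenario = reachable(graph, source, removed=failed)
    return baseline - scenario - failed


# Hypothetical digital model of a physical network (directed links).
network = {
    "plant": ["hub_a", "hub_b"],
    "hub_a": ["town_1", "town_2"],
    "hub_b": ["town_2", "town_3"],
}

print(what_if_fails(network, "plant", {"hub_a"}))
```

Here `town_2` survives the loss of `hub_a` because it has a second route through `hub_b`, whereas `town_1` does not. The scenarios themselves, as Rangan says, remain the practitioner's choice.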

Is there space for Neo4j to jump into the metaverse?

The graph database seems to live in a 3D space. Is there a natural progression for Neo4j to move into the metaverse? And what might that evolution look like? Rangan thinks: ‘It could underpin the types of things that you could do in a metaverse. We have a customer in Asia who actually uses VR glasses to do visualisation and data exploration. So that is definitely a fascinating direction to take. It presents a whole bunch of possibilities, again back to the notion of intuitive mental models that we have about the world and how we are able to represent it.’

For Rangan, the future of this Moore’s Law-style database evolution is very bright, especially in the world of AI. ‘Gartner predicts that in three or four years, 80% of ML models will be based on graph technology. And the reason is a deep neural learning network is fundamentally a graph. But rather than having table-based data that gets translated into a graph, graph databases are natively graph, so it’s a much faster process.’