It’s widely recognized that data is the fundamental building block of modern artificial intelligence. It’s data that helps AI algorithms learn, adapt, and make predictions. As a result, a popular refrain in the context of AI is “there’s no data like more data.” The approach that sometimes derives from this statement is to just throw lots of data into a computer and hope for the best. But in practice, the AI-data relationship is a bit more complicated. One needs to present the right data, in the correct format, and in an appropriate quantity to AI algorithms if one expects to get accurate and useful output.
Why is the AI-data relationship complicated, you ask? Well, here are just a few of the many ways we qualify our data:
Big data. Metadata. Raw data. Unstructured data. Transaction data. Customer data. Proprietary data. Open data. Semantic data. Labeled data.
And here are just a few of the many ways we describe the things we do to or with our data:
Data processing. Data integration. Data privacy. Data management. Data governance. Data security. Data quality. Data discovery. Data sharing.
At SineWave Ventures, data is an important focus area of our thesis-driven investment strategy.
We look for and invest in solutions that implement data management and data governance operations. These solutions categorize and tag big (or meta, or raw, or unstructured…) data, locating and labeling homogeneous subsets of data that can be appropriately and reliably shared with human or machine actors. Better data quality and privacy can improve the efficiency, accuracy, and usefulness of AI algorithms.
We also look for and invest in solutions that flip the data paradigm described above, categorizing and tagging big (or meta, or raw, or unstructured…) data to locate and identify a more heterogeneous superset of data associated with a known data kernel. This kind of solution enables data discovery and data integration, leading to improved accuracy and outcomes for those AI systems designed to leverage a broader data context.
Our thesis is that data is the driving force of business, and that the ability to harness one’s data and use it in effective decision-making will become the make-or-break success criterion for companies.