Machine Learning: Making Sense of Messy CRE Data

Guest Author: Jeramiah Carr, Director of Analytics, IMS

In the world of commercial real estate, every activity involves data: spreadsheets with hundreds or even thousands of lines of information. More often than not, this data is disordered and chaotic. In other words, it’s messy. CRE professionals are now beginning to leverage technology to clean up that messy data, a task that is one of the top challenges facing analysts today. The aggregation and pre-processing of data, commonly referred to as ‘data wrangling’, is commonly estimated to account for 80 to 90% of the effort in a data analytics project. It’s a time-consuming and expensive problem, and technology companies are racing to develop innovative solutions.

Read on to learn what constitutes messy data in the commercial real estate industry and how machine learning can help fix it.

What is considered “messy CRE data”?

When people think about messy data, most probably picture an antiquated database or 20-tab spreadsheets, each structured in a different way. Although this data appears disorganized and chaotic, many tools available today can ingest Excel spreadsheets. In 1998, Merrill Lynch cited a rule that changes the way we think about what constitutes data, suggesting, “…80 to 90% of all potentially usable information is in an unstructured form”. In other words, data exists all around us in everything we do: the operating agreements we put in place, the news articles we read, social media, the for-sale signs posted on the front of buildings that appear as you walk around Google Maps. All of these sources of information consist of thousands of data attributes, which our brains turn into information at the moment of consumption. While our brains excel at general pattern recognition, most of us can’t consume or make sense of massive amounts of data at once. Imagine moving through Google Maps at 60 miles per hour: your brain is not fast enough to process the intricacies of each frame. So how can you leverage technology to turn data into a value-add asset?

Machine Learning to the rescue

Machine Learning (ML) is a core building block of Artificial Intelligence (AI), and as such will, with time, disrupt all industries. The shift that will occur in CRE will be just as significant as the shift that happened when we moved from paper to computers. ML and AI will change everything. The CRE industry has already begun to amass more clean data than ever before, but most of it has been the result of manual labor. Soon, extracting information from unstructured sources, such as real estate publications, will allow for datasets large enough to build systems that make very meaningful predictions. Machine learning will even help with those Excel spreadsheets, cleaning data via phonetic matching and outlier identification. No longer will a phone number sit in an address field, nor will a distribution amount fall unrealistically outside the range of cash flow.
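To make the spreadsheet-cleaning ideas above a little more concrete, here is a minimal Python sketch of the three checks mentioned: phonetic matching, spotting a phone number in an address field, and outlier identification. It is an illustration under stated assumptions, not a description of any particular product. The function names (soundex, looks_like_phone, mad_outliers), the sample values, and the 3.5 threshold are all hypothetical choices made for this example. Phonetic matching is shown with the classic Soundex code, and outlier identification with a median-absolute-deviation rule, which is only one of several reasonable tests.

```python
import re
import statistics

# Standard Soundex digit groups; vowels, h, w and y carry no digit.
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(name):
    """Return the 4-character Soundex code for a single word (hypothetical helper)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with the same code
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code after it counts
    return (code + "000")[:4]


def looks_like_phone(value):
    """Heuristic: only phone punctuation and 7 to 15 digits (assumed rule)."""
    text = value.strip()
    if not re.fullmatch(r"\+?[\d\s().-]+", text):
        return False
    return 7 <= sum(ch.isdigit() for ch in text) <= 15


def mad_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the (assumed) threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Illustrative, made-up records
same_owner = soundex("Smith") == soundex("Smyth")           # spelling variants match
bad_address = looks_like_phone("(555) 123-4567")            # phone in address field
suspect = mad_outliers([1000, 1100, 950, 1050, 1020, 98000])  # one extreme distribution
```

In practice, checks like these would only flag records for review; deciding whether a flagged distribution is a typo or a genuine one-off payment still needs context from the underlying deal.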

Who will win?

CRE professionals acknowledge that this is a data-driven industry, albeit one whose data typically resides in scattered spreadsheets. When technology is leveraged, that data becomes more readily available and actionable. Having the best data and algorithms will empower you to identify risks and drive value, thereby increasing NOI faster than your peers and driving competitors out of business.

IMS is making big investments in people and technology, positioning our software to deliver predictive analytics and automated alerts based on key performance indicators (KPIs) through machine learning and artificial intelligence in the CRE Fintech space. For a sponsor, the first step in this journey is getting a handle on the data you have and understanding it. The second step is investing in software that will deliver forecasts and insights for the next generation of CRE. The IMS Advanced Analytics product is leading the industry, providing dashboards and reporting, and will continue to evolve and drive toward this CRE vision.

