The video demo was produced by our Head of Blockchain technology, Leo Liang, but all members of our Testnet Telegram Group will have access to testing as well. Interested? Click here to join.
How Can GNY’s Decentralized Machine Learning Be Applied to Covid-19 Analysis?
COVID-19 is an infectious disease caused by a new coronavirus. The entire world is looking for ways to more effectively collaborate, share, record and analyse data to aid those seeking a solution to the Covid-19 pandemic.
This demo shows how interested groups can securely work together to share sensitive data, securely analyse it on-chain, then access and share the results with key stakeholders.
The Challenge: The Covid-19 pandemic is straining the public health system beyond its limits. To demonstrate how GNY’s machine learning blockchain can be used collaboratively and securely to guide public health decisions, we hypothesised how multiple stakeholders sharing data in a decentralised way can provide real value and powerful results.
Our Hypothetical: Individual states in the US want a secure data system in place to upload their daily mortalities and other data and to run machine learning analysis on it, including comparative analysis. Instead of being transferred to a central location, these datasets are held by the states individually but shared securely with other states on this sidechain. When a dataset is updated, the sidechain verifies the update and automatically keeps every copy of the dataset on the sidechain current. We can imagine this hypothetical being applied to hospitals and other institutions, but we selected states because they have the best available data via Johns Hopkins University.
In our hypothetical scenario these states decide to use GNY’s decentralised machine learning blockchain to analyse their cities’ daily cumulative Covid-19 related mortalities with the following goals:
● Consistency: Agree upon common formatting and data structures for all states so that data is consistent and provides the highest quality ML results.
● Security: Share and store their daily Covid-19 data in a secure and confident manner.
● Collaboration: Share this data, together with the algorithmic models, with other states to securely run ML analysis on-chain.
● Analysis: Access comparative analysis on how their states are tackling the disease vs. other states in the US.
● Staying current: Streamline the process of maintaining up-to-date data. When a state updates data on its own node, every state on the private sidechain automatically receives the updated information, keeping ML analysis current and meaningful with minimal effort.
● Audit: The blockchain retains a historical time series record of when data is added or updated, when models are modified or updated, and what predictive results are created, throughout the life of the blockchain, creating an indelible data audit trail.
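To make the audit goal concrete, here is a minimal sketch of a tamper-evident, hash-chained event log. It is not GNY's implementation: the function names (`entry_hash`, `append_entry`, `verify_log`), the entry fields, and the sample events are all hypothetical, chosen only to show why a record that commits to the hash of the previous record is hard to alter unnoticed.

```python
import hashlib
import json


def entry_hash(body: dict) -> str:
    """Hash an entry body deterministically (sorted keys, SHA-256)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: list, timestamp: str, event: str, detail: dict) -> dict:
    """Append an event that commits to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {
        "timestamp": timestamp,
        "event": event,
        "detail": detail,
        "prev_hash": prev_hash,
    }
    entry = dict(body, hash=entry_hash(body))
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Return True only if every entry's hash and back-link are intact."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != entry_hash(body):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical events: data added, data updated, prediction created.
audit_log = []
append_entry(audit_log, "2020-05-17T00:00:00Z", "data_added", {"state": "A", "rows": 120})
append_entry(audit_log, "2020-05-18T00:00:00Z", "data_updated", {"state": "A", "rows": 121})
append_entry(audit_log, "2020-05-18T00:05:00Z", "prediction_created", {"model": "svm"})
print(verify_log(audit_log))  # True: the chain is intact

# Editing an old record breaks its stored hash, so verification fails.
audit_log[1]["detail"]["rows"] = 999
print(verify_log(audit_log))  # False
```

Because each entry stores the hash of the one before it, changing any historical record invalidates that record's stored hash, which is the property the audit goal relies on.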
Based on these goals, our team decentralised an SVM (Support Vector Machine) algorithm on chain. The SVM is fed prepared data corresponding to individual cities’ time-series Covid-19 mortality data. The SVM model identifies outliers, in this case cities whose infection rate may be increasing or decreasing faster than that of their peers, based on where they are in their infection cycle. These results can be viewed as an early detection system. Rather than looking at raw numbers on their own, states receive results that analyse rate and trajectory in collaboration with other states battling the same pandemic.
In this example, decentralising the SVM algorithm makes it possible to process the decentralised datasets together: the model looks across the different US cities and compares each city against the others. It can alert states to either positive or negative deviations from the current norm, potentially before statistical models can. It may allow states to increase or roll back certain pandemic control measures in a more timely manner. This could be one of thousands of analyses states and the country could perform. Now more than ever, we can all benefit from decentralised, secure models to collaborate and solve big-data driven problems.
The bracketed data on the right represents actual cumulative COVID-19 deaths on May 17th and 18th for 6 sample US cities. The positive or negative integer above each pair is the result of the SVM analysis. A value of positive one marks a city that is following the pattern observed in most other US cities, in other words progressing along a standard vector; these cities are shown in red.
A value of negative one marks a city whose mortality rate is changing outside the range of that vector; these cities are flagged as outliers and shown in blue.
3148 US cities were analyzed using the decentralized SVM model, and 283 were shown to be significant outliers, with a 20%+ variation from the overall vector shown by all cities.
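The labelling described above can be illustrated with a much simpler stand-in than an SVM. The sketch below is not the on-chain model: it labels a city +1 or -1 by comparing its day-over-day growth in cumulative deaths with the median growth across all cities. The city names and counts are made up, the function names are hypothetical, and reading "20%+ variation" as a relative deviation from the median growth ratio is an assumption.

```python
from statistics import median


def growth_ratio(day1: int, day2: int) -> float:
    """Day-over-day ratio of cumulative deaths (assumes day1 > 0)."""
    if day1 <= 0:
        raise ValueError("day1 must be positive to compute a ratio")
    return day2 / day1


def label_cities(counts: dict, threshold: float = 0.20) -> dict:
    """Label each city +1 (near the median growth ratio) or -1 (outlier)."""
    ratios = {city: growth_ratio(d1, d2) for city, (d1, d2) in counts.items()}
    typical = median(ratios.values())
    return {
        city: -1 if abs(ratio - typical) / typical > threshold else 1
        for city, ratio in ratios.items()
    }


# Made-up cumulative deaths for two consecutive days: (day 1, day 2).
sample = {
    "city_a": (100, 102),
    "city_b": (250, 256),
    "city_c": (40, 41),
    "city_d": (80, 120),  # grows far faster than the rest
    "city_e": (500, 510),
    "city_f": (60, 61),
}

# Print each city's bracketed counts followed by its +1 / -1 label.
for city, label in label_cities(sample).items():
    print(city, list(sample[city]), label)
```

In this made-up sample only `city_d` is labelled -1, because its growth ratio deviates from the median by more than the 20% threshold. An SVM can learn a more flexible boundary from full time series, which a single median rule cannot.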
For the purposes of this example, we have anonymized the city data.
Please leave your questions in the comments below, or join the conversation in our Telegram group.