Traditionally, data cleansing is defined as the process of detecting and correcting (or removing) corrupt or inaccurate records in a record set, table, or database. It involves identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data.
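As a rough illustration of that definition, the following minimal Python sketch detects incomplete or implausible records and either corrects or removes them. The record fields, the valid range, and the unit mapping are all assumptions made up for this example; they do not come from any particular product.

```python
from typing import Optional

# Hypothetical sensor records; field names and values are illustrative only.
records = [
    {"sensor_id": "T1", "unit": "degC", "value": 21.5},
    {"sensor_id": "T1", "unit": "degc", "value": 21.7},    # inconsistent unit spelling
    {"sensor_id": "T2", "unit": "degC", "value": None},    # incomplete: missing value
    {"sensor_id": "",   "unit": "degC", "value": 20.9},    # incomplete: missing sensor id
    {"sensor_id": "T3", "unit": "degC", "value": 9999.0},  # outside the assumed valid range
]

VALID_RANGE = (-50.0, 150.0)         # assumed plausible range for this sensor type
CANONICAL_UNITS = {"degc": "degC"}   # assumed mapping from lowercase spelling to canonical unit


def clean(record: dict) -> Optional[dict]:
    """Return a corrected copy of the record, or None if it should be removed."""
    # Remove records that are incomplete.
    if not record["sensor_id"] or record["value"] is None:
        return None
    # Remove records whose value is implausible.
    low, high = VALID_RANGE
    if not low <= record["value"] <= high:
        return None
    # Correct inconsistent unit spellings instead of discarding the record.
    fixed = dict(record)
    fixed["unit"] = CANONICAL_UNITS.get(record["unit"].lower(), record["unit"])
    return fixed


cleaned = [r for r in (clean(rec) for rec in records) if r is not None]
```

Here the two `T1` records survive (one with its unit spelling corrected), while the incomplete and out-of-range records are dropped. In practice, whether to correct or remove a record depends on the application.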
In recent years, sensor networks have gained wide popularity in a variety of application scenarios, ranging from monitoring applications on manufacturing production lines to more sophisticated sensor deployments in research and development, such as autonomous driving in the automotive industry. The metadata gathered while these massive sensor data sets are generated plays an increasingly important role, because it provides the key attribute information that allows a big data set to be strategically managed and prepared for analysis.
Metadata is data about data and can be regarded as the properties of the data. Once the data has been acquired, the associated metadata becomes equally important. In general, it is common to see these types of metadata after the data acquisition process.
The DataLook module of Viviota’s Time-to-Insight (TTI) software suite is based on NI’s DataFinder technology, an indexing service that parses any custom file format for descriptive information (metadata) and creates a database of that information for the target data files. This database is updated automatically as soon as a valid data file is created, deleted, or edited. The metadata is indexed with the help of DataPlugins, which map custom file formats onto the TDM model. Once the metadata is indexed, the DataFinder search looks at all of the metadata at the file, channel group, and channel levels based on user-specified search criteria.
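To make the three-level search concrete, here is a toy in-memory model in Python. It is only a sketch of the idea of matching criteria at the file, channel group, and channel levels. It is not the DataFinder API, and every class, function, property name, and file name below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class ChannelGroup:
    name: str
    properties: dict = field(default_factory=dict)
    channels: list = field(default_factory=list)


@dataclass
class DataFile:
    path: str
    properties: dict = field(default_factory=dict)
    groups: list = field(default_factory=list)


def matches(properties: dict, criteria: dict) -> bool:
    """True if every criterion equals the corresponding property value."""
    return all(properties.get(key) == value for key, value in criteria.items())


def search(index, file_criteria=None, group_criteria=None, channel_criteria=None):
    """Yield (file path, group name, channel name) for channels meeting all criteria."""
    for data_file in index:
        if not matches(data_file.properties, file_criteria or {}):
            continue
        for group in data_file.groups:
            if not matches(group.properties, group_criteria or {}):
                continue
            for channel in group.channels:
                if matches(channel.properties, channel_criteria or {}):
                    yield (data_file.path, group.name, channel.name)


# A tiny hypothetical index with metadata at all three levels.
index = [
    DataFile("run_001.dat", {"vehicle": "A"}, [
        ChannelGroup("engine", {"operator": "kim"}, [
            Channel("rpm", {"unit": "1/min"}),
            Channel("oil_temp", {"unit": "degC"}),
        ]),
    ]),
    DataFile("run_002.dat", {"vehicle": "B"}, [
        ChannelGroup("engine", {"operator": "kim"}, [
            Channel("oil_temp", {"unit": "degC"}),
        ]),
    ]),
]

hits = list(search(index,
                   file_criteria={"vehicle": "A"},
                   channel_criteria={"unit": "degC"}))
```

With this sample index, the query returns only the `oil_temp` channel from `run_001.dat`. A real indexing service would persist the index in a database and support richer operators than equality; the nested loop is used here only to keep the hierarchy visible.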
For the DataLook module to find the data sets needed for analysis rapidly and efficiently, the TTI workflow first goes through the DataPrep module, which is dedicated to data cleansing tasks. The tasks typically include:
In general, the TTI software suite shortens time-consuming test data management tasks that once took days so that they now happen in seconds, which improves efficiency and significantly reduces product time-to-market.