What is DataStar?
DataStar is PSI’s data normalization platform. It:
- Makes enterprise data interoperable and accurate.
- Correlates information from many sources to determine the existence and root causes of data inconsistencies.
- Finds and solves problems that clients are unaware of and vulnerable to.
- Automatically extracts information from documents to generate cross-referenced architecture products: data models, code dictionaries, glossaries, and system models.
- Collects and correlates human knowledge from both business and technical people, merging it with other information.
- Captures actual data profiles and usage from operational data systems.
- Maintains all of this information for continuous inspection, analysis, and collaboration.
- Provides unprecedented visibility into the true data environment, unlike typical architecture, metadata, and data management tools, which fail to mesh the design with the reality of operational data.
What Does DataStar Do?
It uses a combination of advanced methods to identify and solve actual data inconsistencies, without the extraordinarily slow and expensive techniques that have stalled prior efforts and are widely recognized as unable to make enterprise data truly interoperable and accurate. PSI’s technology correlates information from many sources to determine the existence and root causes of data inconsistencies, almost always finding significant problems within the first week that the client was unaware of and highly vulnerable to. These sources include:
- automatic extraction from a variety of document formats to generate cross-referenced and correlated architecture products such as data models, code dictionaries, and glossaries;
- collection and correlation of human knowledge from both business and technical people, gathered during the course of normal meetings and merged with other information to specify the actual scope, use cases, ownership, and context of data origination and use;
- actual data profiles and usage from operational data systems.
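To make the correlation idea concrete, here is a minimal sketch of profiling the same field from two source systems to surface code-value inconsistencies. The function names, record format, and sample data are assumptions for illustration only; they are not DataStar's actual API.

```python
from collections import Counter

def profile_values(records, field):
    """Count the distinct values a field actually takes in a data set."""
    return Counter(r.get(field) for r in records)

def find_inconsistencies(source_a, source_b, field):
    """Return values of a field that appear in one system but not the other."""
    a = set(profile_values(source_a, field))
    b = set(profile_values(source_b, field))
    return {"only_in_a": sorted(a - b), "only_in_b": sorted(b - a)}

# Two systems encoding the same "country" field differently:
billing = [{"country": "US"}, {"country": "US"}, {"country": "DE"}]
shipping = [{"country": "USA"}, {"country": "DEU"}]

print(find_inconsistencies(billing, shipping, "country"))
# {'only_in_a': ['DE', 'US'], 'only_in_b': ['DEU', 'USA']}
```

Run against real operational extracts, a profile like this is often the first signal that two systems disagree about what values a field may hold.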
All of this information is maintained and available for continuous inspection, analysis, and collaboration in the PSI DataStar Discovery web-based environment. This provides unprecedented visibility into the true data environment compared to typical architecture, metadata, and data management tools, which fail to mesh the design with the reality of operational data.
The PSI approach has proven its ability to solve these challenges far more rapidly and collaboratively than any other system or method. It is the unique combination of visible documentation and data use context, guided analysis, knowledge collection, and reverse engineering of operational data that enables solutions in situations where other tools and methods, most notably standard integration and modeling, fail.
Using the PSIKORS analysis results, PSI’s technology produces common normalized data within a flexible services data environment, enabling legacy and modernized applications to operate on accurate, clearly defined data. This allows graceful consolidation and retirement of legacy systems, and provides a low-cost, high-performance environment that supplies governance-approved, visibly managed, accurate data to all users. Because the process is automated, updated business rules or needs can be managed in the DataStar system and a newly updated production data set generated the same day, all with open, visible oversight by managers and users across the data supply chain.
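The same-day regeneration described above rests on keeping business rules as managed data rather than hard-coded logic: change the rule, rerun the pipeline. A minimal sketch of that pattern, with an assumed code-mapping rule format that is illustrative only and not DataStar's actual rule representation:

```python
# Business rules kept as data: updating a mapping and regenerating the
# normalized output is one step, with no code change.
CODE_MAP_RULES = {
    "country": {"USA": "US", "DEU": "DE", "UNITED STATES": "US"},
}

def normalize(record, rules):
    """Apply code-mapping rules to a single record, returning a new record."""
    out = dict(record)
    for field, mapping in rules.items():
        if field in out:
            out[field] = mapping.get(out[field], out[field])
    return out

rows = [{"id": 1, "country": "USA"}, {"id": 2, "country": "DE"}]
print([normalize(r, CODE_MAP_RULES) for r in rows])
# [{'id': 1, 'country': 'US'}, {'id': 2, 'country': 'DE'}]
```

Because the rules live in a data structure, a governance change (say, adding a new code mapping) requires only updating `CODE_MAP_RULES` and regenerating the data set.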
This portion of the PSIKORS technology is the DataStar Unifier module, which uses a hybrid approach: standard data tables to support governance and existing personnel skill sets, with NoSQL physical data execution to exploit the immense speed, performance, and low cost of that newer technology. Data normalization can run in either batch or transaction mode. It can run on existing hardware and database servers, or on the tailored PSI PSIKLOPS hardware, whose low-cost Hadoop environment enables high-performance parallel processing at commodity hardware costs.
DataStar Unifier manages the business logic used to merge and correct data in complex business scenarios. It manages both the rules and knowledge of why and how to normalize mission-critical data, and the executable code that performs the normalization on high-powered, small-cluster Hadoop. You focus only on the business logic, since PSI supplies the libraries and Hadoop client controls that handle all interfacing and runtime execution of parallel processing in the low-cost Hadoop environment.
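The "you write only the business logic" split above can be sketched as a plain merge rule for duplicate records, expressed as a reduce function that a parallel framework (Hadoop, in the text) could apply per key. The survivorship rule here (most recent record wins, but never overwrite with an empty value) and the record fields are hypothetical examples, not DataStar's actual merge logic.

```python
from functools import reduce

def merge_pair(a, b):
    """Business rule: the later record wins, but never overwrite with empty."""
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    merged = dict(older)
    merged.update({k: v for k, v in newer.items() if v not in (None, "")})
    return merged

# Two versions of the same customer record, e.g. sharded to one reducer by id:
duplicates = [
    {"id": 7, "email": "old@x.com", "phone": "555-0100", "updated": 1},
    {"id": 7, "email": "new@x.com", "phone": "",         "updated": 2},
]
print(reduce(merge_pair, duplicates))
# {'id': 7, 'email': 'new@x.com', 'phone': '555-0100', 'updated': 2}
```

Only `merge_pair` encodes business knowledge; the grouping of records by key and the distributed execution would be handled by the supplied runtime, which is the division of labor the text describes.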