PHEMI Central is a new class of data warehouse that uses big data technologies to handle any volume and variety of data, while providing advanced features for big data privacy, management and governance—all built right in.
PHEMI Central can collect any kind of data—structured (such as database records), semi-structured (such as Microsoft Excel, machine-collected data, or genomic files), or unstructured (such as images or documents). Data can be ingested and aggregated from multiple disparate sources. During collection, PHEMI Central tags each raw data object with metadata, then stores the tagged data in PHEMI’s fast, powerful Smart Data Store. The Smart Data Store is key-value-based and schemaless, so data can be ingested at sub-second rates without complex, time-consuming, and brittle schema-mapping exercises.
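The ingest step described above (tag each raw object with metadata, then store it without mapping it to a schema) can be sketched in a few lines of Python. This is only an illustration of the general idea, not PHEMI's implementation: the `KeyValueStore` class, its `put`, `get`, and `find` methods, and the `source` and `kind` tags are all hypothetical names chosen for the example.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class TaggedObject:
    payload: bytes                                  # raw data, stored exactly as received
    metadata: dict = field(default_factory=dict)    # tags attached at ingest time


class KeyValueStore:
    """Hypothetical in-memory stand-in for a schemaless key-value store."""

    def __init__(self):
        self._items = {}

    def put(self, payload: bytes, **tags) -> str:
        # The key is derived from the content; the payload is never parsed,
        # so no schema mapping is needed before the object can be stored.
        key = hashlib.sha256(payload).hexdigest()
        metadata = {"ingested_at": time.time(), "size": len(payload), **tags}
        self._items[key] = TaggedObject(payload, metadata)
        return key

    def get(self, key: str) -> TaggedObject:
        return self._items[key]

    def find(self, **tags) -> list:
        # Return the keys of all objects whose metadata matches every given tag.
        return [
            key for key, obj in self._items.items()
            if all(obj.metadata.get(name) == value for name, value in tags.items())
        ]


store = KeyValueStore()
csv_key = store.put(b"id,name\n1,Ada\n", source="clinic_a", kind="structured")
img_key = store.put(b"\x89PNG...", source="clinic_b", kind="unstructured")
```

Because nothing about the payload's internal layout is assumed at write time, a database extract and an image go through the same `put` call; any structure is applied later, when the data is processed.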
The PHEMI Central Big Data Warehouse automatically indexes and catalogs all ingested data. Next, user-specifiable Data Processing Functions cleanse, parse, and structure the data, transforming the tagged raw data into analytics-ready digital assets. All data is cataloged and indexed for linking based on keywords, graph relationships, and geospatial attributes. Aggregates are computed to accelerate analytics and application performance. This lets users find and consume specific digital assets at sub-second speed across petabytes of data.
With PHEMI Central, you can build datasets on demand without having to manage multiple data marts or complex MapReduce or YARN processes. Whether the dataset is consumed by a user or exported to a spreadsheet, application, or analytics tool, PHEMI Central strictly enforces your organization’s big data privacy and security policies to ensure appropriate access to data.
PHEMI Central can automatically de-identify, encrypt, or mask personal information and enforce privacy based on sophisticated user access privileges and fine-grained sharing and consent rules. With a Privacy by Design framework at its core, PHEMI Central helps you achieve your organization’s governance objectives.
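As a rough illustration of field-level privacy enforcement, the sketch below applies a per-field policy to a record depending on whether the requesting user is privileged. Everything here is an assumption made for the example: the `POLICY` table, the `apply_policy` function, the field names, and the use of a salted hash as a simple pseudonym are hypothetical and are not drawn from PHEMI Central itself.

```python
import hashlib

# Hypothetical policy: field name -> action applied for non-privileged users.
POLICY = {"name": "mask", "health_number": "deidentify", "diagnosis": "allow"}


def deidentify(value: str, salt: str = "demo-salt") -> str:
    # A salted one-way hash yields a stable pseudonym, so records can still
    # be linked without exposing the original identifier.
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]


def mask(value: str) -> str:
    # Replace every character so only the length of the value remains visible.
    return "*" * len(value)


def apply_policy(record: dict, privileged: bool) -> dict:
    if privileged:
        return dict(record)                  # full view, returned as a copy
    view = {}
    for name, value in record.items():
        action = POLICY.get(name, "mask")    # fields without a rule default to masked
        if action == "allow":
            view[name] = value
        elif action == "deidentify":
            view[name] = deidentify(value)
        else:
            view[name] = mask(value)
    return view


record = {"name": "Ada Lovelace", "health_number": "9876543210", "diagnosis": "asthma"}
analyst_view = apply_policy(record, privileged=False)
clinician_view = apply_policy(record, privileged=True)
```

Defaulting unknown fields to masked means a newly added field is hidden until someone writes an explicit rule for it. A real deployment would need stronger de-identification than a short salted hash, along with consent and sharing rules that this sketch leaves out.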
PHEMI Central incorporates advanced data management features such as version control, rollback, and retention rules. A sophisticated metadata framework allows information to be managed at the field level throughout its lifecycle. This family of features brings the data management capabilities of enterprise-grade traditional data warehouses to big data.
PHEMI Central runs on commercial servers and commodity disk drives, driving down hardware costs and allowing the system to scale from terabytes to petabytes without the expense or performance bottlenecks of a Storage Area Network (SAN) or Network Attached Storage (NAS). Automatic replication and load balancing mean data is always available and performance is optimized across system nodes.
In many cases, the path that organizations take to implement big data starts by piecing together a stack of open-source Hadoop components. And then, what next? When embarking on a big data project, it is important for organizations to understand exactly what they are getting into, pay attention to the challenges that have derailed others’ projects, and monitor the impact on total cost of ownership (TCO).