Get started
Bring yourself up to speed with our introductory content.
15 ways AI influences the data management landscape
AI, NLP and machine learning advancements have become core to data management processes. Ask tool vendors how they use -- or fail to use -- AI in these 15 areas.
How to create a data quality management process in 5 steps
Data quality requires accurate and complete data that fits task-based needs. These five steps establish a data quality management process to ensure data fits its purpose.
Hadoop
Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications in scalable clusters of computer servers.
data lakehouse
A data lakehouse is a data management architecture that combines the key features and the benefits of a data lake and a data warehouse.
How to approach data mesh implementation
Data mesh takes a decentralized approach to data management, setting it apart from data lakes and warehouses. Organizations can transition to data mesh with progressive steps.
data analytics (DA)
Data analytics (DA) is the process of examining data sets to find trends and draw conclusions about the information they contain.
data pipeline
A data pipeline is a set of network connections and processing steps that moves data from a source system to a target location and transforms it for planned business uses.
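As a minimal sketch of those stages, with made-up source records and field names, a pipeline's extract, transform and load steps can be expressed as three small functions:

```python
# Minimal data pipeline sketch: extract -> transform -> load.
# The source records and field names are illustrative assumptions.

def extract():
    # In practice this might read from a database or API; here, static rows.
    return [
        {"order_id": 1, "amount": "19.99", "region": "east"},
        {"order_id": 2, "amount": "5.00", "region": "WEST"},
    ]

def transform(rows):
    # Normalize types and values for the target system.
    return [
        {"order_id": r["order_id"],
         "amount": float(r["amount"]),
         "region": r["region"].lower()}
        for r in rows
    ]

def load(rows, target):
    # Append transformed rows to the target store (a list stands in here).
    target.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
```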
master data management (MDM)
Master data management (MDM) is a process that creates a uniform set of data on customers, products, suppliers and other business entities from different IT systems.
Essential skills for data-centric developers
To become more data-driven, organizations need data-centric developers. Developers can learn a mix of technical and interpersonal skills to be attractive candidates for the job.
Data steward responsibilities fill data quality role
Data stewards tie together data operations. From quality to governance to boosting collaboration, data stewards are valuable members of any data effort.
Enhance data governance with distributed data stewardship
Data stewardship and distributed stewardship models bring different tools to data governance strategies. Organizations need to understand the differences to choose the best fit.
Data lakes: Key to the modern data management platform
Data lakes influence the modern data management platform at all levels. Organizations can gain faster insights, save costs, improve governance and boost self-service data access.
data integration
Data integration is the process of combining data from multiple source systems to create unified sets of information for both operational and analytical uses.
transcription error
A transcription error is a type of data entry error commonly made by human operators or by optical character recognition (OCR) programs.
MongoDB
MongoDB is an open source NoSQL database management program.
Data stewardship: Essential to data governance strategies
As data governance gets increasingly complicated, data stewards are stepping in to manage security and quality. Without one, organizations lose speed, quality info and opportunity.
data warehouse
A data warehouse is a repository of data from an organization's operational systems and other sources that supports analytics applications to help drive business decision-making.
database (DB)
A database is a collection of information that is organized so that it can be easily accessed, managed and updated.
18 top big data tools and technologies to know about in 2023
Numerous tools are available to use in big data applications. Here's a look at 18 popular open source technologies, plus additional information on NoSQL databases.
database replication
Database replication is the frequent electronic copying of data from a database in one computer or server to a database in another -- so that all users share the same level of information.
DataOps
DataOps is an Agile approach to designing, implementing and maintaining a distributed data architecture that will support a wide range of open source tools and frameworks in production.
Comparing DBMS vs. RDBMS: Key differences
A relational database management system is the most popular type of DBMS for business uses. Find out how RDBMS software differs from other DBMS technologies.
What is data management and why is it important?
Data management is the process of ingesting, storing, organizing and maintaining the data created and collected by an organization, as explained in this in-depth look at the process.
data mesh
Data mesh is a decentralized data management architecture for analytics and data science.
data observability
Data observability is a process and set of practices that aim to help data teams understand the overall health of the data in their organization's IT systems.
What key roles should a data management team include?
These 10 roles, with different responsibilities, are commonly a part of the data management teams that organizations rely on to make sure their data is ready to use.
Data tenancy maturity model boosts performance and security
A data tenancy maturity model can boost an organization's data operations and help improve the protection of customer data. Improvement is tracked through tiers of data tenancy.
data quality
Data quality is a measure of the condition of data based on factors such as accuracy, completeness, consistency, reliability and whether it's up to date.
What is a data warehouse analyst?
Data warehouse analysts help organizations manage the repositories of analytics data and use them effectively. Here's a look at the role and its responsibilities.
Data observability benefits entire data pipeline performance
Data observability offers benefits such as improving data quality and identifying issues in the pipeline process, but it also has challenges organizations must solve for success.
OPAC (Online Public Access Catalog)
An OPAC (Online Public Access Catalog) is an online bibliography of a library collection that is available to the public.
primary key (primary keyword)
A primary key, also called a primary keyword, is a column in a relational database table that's distinctive for each record.
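A quick way to see primary key enforcement in action is SQLite via Python's built-in sqlite3 module; the table and column names here are illustrative:

```python
import sqlite3

# A primary key uniquely identifies each row; the database rejects duplicates.
# Table and column names are made-up examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
try:
    # Reusing primary key 1 violates the uniqueness constraint.
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Grace')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```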
How to reap the benefits of data integration, step by step
A new book lays out a strong case for data integration and guides readers in how to carry out this essential process.
How to build an effective DataOps team
More organizations are turning to DataOps to bolster their data management operations. Learn how to build a team with the right people to ensure DataOps success.
data warehouse as a service (DWaaS)
Data warehouse as a service (DWaaS) is an outsourcing model in which a cloud service provider configures and manages the hardware and software resources a data warehouse requires, and the customer provides the data and pays for the managed service.
database as a service (DBaaS)
Database as a service (DBaaS) is a cloud computing managed service offering that provides access to a database without requiring the setup of physical hardware, the installation of software or the need to configure the database.
data catalog
A data catalog is a software application that creates an inventory of an organization's data assets to help data professionals and business users find relevant data for analytics uses.
Key roles and responsibilities of the modern chief data officer
Chief data officer roles and responsibilities are expanding beyond data strategy, as CDOs are increasingly tasked with cultivating a data-driven culture.
database administrator (DBA)
A database administrator (DBA) is the information technician responsible for directing or performing all activities related to maintaining a successful database environment.
What is data lineage? Techniques, best practices and tools
Organizations can bolster data governance efforts by tracking the lineage of data in their systems. Get advice on how to do so and how data lineage tools can help.
The evolution of the chief data officer role
Chief data officers are taking on additional responsibilities beyond data management as they strive to transform organizations' data culture and focus on value creation.
database management system (DBMS)
A database management system (DBMS) is system software for creating and managing databases, allowing end users to create, protect, read, update and delete data in a database.
Cloud DBA: How cloud changes database administrator's role
Cloud databases change the duties and responsibilities of database administrators. Here's how the job of a cloud DBA differs from what an on-premises one does.
data classification
Data classification is the process of organizing data into categories that make it easy to retrieve, sort and store for future use.
10 trends shaping the chief data officer role
As data use increases and organizations turn to business intelligence to optimize information, these 10 chief data officer trends are shaping the role.
DBMS keys: 8 types of keys defined
Here's a guide to primary, super, foreign and candidate keys, what they're used for in relational database management systems and the differences among them.
How to build a data catalog: 10 key steps
A data catalog helps business and analytics users explore data assets, find relevant data and understand what it means. Here are 10 important steps for building one.
What is data governance and why does it matter?
Data governance (DG) is the process of managing the availability, usability, integrity and security of the data in enterprise systems, based on internal data standards and policies that also control data usage.
How to evaluate and optimize data warehouse performance
Organizations build data warehouses to satisfy their information management needs. Data warehouse optimization can help ensure that these warehouses achieve their full potential.
6 key steps to develop a data governance strategy
Data governance shouldn't be built around technology, but the other way around. Existing infrastructure, executive support, data literacy, metrics and proper tools are essential.
7 best practices for successful data governance programs
A comprehensive, companywide data governance program strengthens data infrastructure, improves compliance initiatives, supports strategic intelligence and boosts customer loyalty.
3 considerations for a data compliance management strategy
A data compliance management strategy is key for organizations to protect data the right way. Different positions have responsibility to ensure industry regulations are met.
Data Dredging (data fishing)
Data dredging -- sometimes referred to as data fishing -- is a data mining practice in which large data volumes are analyzed to find any possible relationships between them.
5 key elements of data tenancy
Data tenancy is a key piece of any data protection scheme and can be crafted around five building blocks to provide safe, secure data access to users.
data stewardship
Data stewardship is the management and oversight of an organization's data assets to help provide business users with high-quality data that is easily accessible in a consistent manner.
What a big data strategy includes and how to build one
Companies analyze stores of big data to improve how they operate. But those efforts will bring diminishing returns without a big data strategy. Here's how to build one.
How big data collection works: Process, challenges, techniques
Taming large amounts of data from multiple sources, and deriving the greatest value from it to support trusted business decisions, hinges on a sound system for collecting big data.
Structured Query Language (SQL)
Structured Query Language (SQL) is a standardized programming language that is used to manage relational databases and perform various operations on the data in them.
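As an illustrative example, using Python's built-in sqlite3 module and a made-up schema, SQL statements can create, populate and query a relational table:

```python
import sqlite3

# SQL in action: define a table, insert rows, then aggregate with a query.
# The sales schema and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("west", 250.0), ("east", 50.0)])
total_east = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'east'").fetchone()[0]
```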
Self-service data preparation: What it is and how it helps users
Using self-service tools to properly prepare data simplifies analytics and visualization tasks for business users and speeds complex modeling processes for data scientists.
data validation
Data validation is the practice of checking the integrity, accuracy and structure of data before it is used for a business operation.
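A minimal sketch of record-level validation might look like the following; the fields and rules (email format, age range) are illustrative assumptions:

```python
# Validate a record before it is used; return a list of rule violations.
# The required fields and acceptable ranges are made-up examples.

def validate(record):
    errors = []
    if not record.get("email") or "@" not in record["email"]:
        errors.append("invalid email")
    if not isinstance(record.get("age"), int) or not (0 <= record["age"] <= 130):
        errors.append("age out of range")
    return errors

good = {"email": "a@example.com", "age": 34}
bad = {"email": "nope", "age": -5}
```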
data profiling
Data profiling refers to the process of examining, analyzing, reviewing and summarizing data sets to gain insight into the quality of data.
data preprocessing
Data preprocessing, a component of data preparation, describes any type of processing performed on raw data to prepare it for another data processing procedure.
data cleansing (data cleaning, data scrubbing)
Data cleansing, also referred to as data cleaning or data scrubbing, is the process of fixing incorrect, incomplete, duplicate or otherwise erroneous data in a data set.
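As a small illustrative sketch, cleansing can normalize inconsistent formatting and drop duplicate records; the field names here are assumptions:

```python
# Data cleansing sketch: trim whitespace, standardize case, drop duplicates.
# The "name" and "city" fields are illustrative.

def cleanse(rows):
    seen = set()
    cleaned = []
    for r in rows:
        r = {"name": r["name"].strip().title(),
             "city": r["city"].strip().lower()}
        key = (r["name"], r["city"])
        if key not in seen:  # keep only the first copy of each record
            seen.add(key)
            cleaned.append(r)
    return cleaned

raw = [
    {"name": "  ada lovelace ", "city": "London "},
    {"name": "Ada Lovelace", "city": "london"},
]
result = cleanse(raw)
```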
big data
Big data is a combination of structured, semistructured and unstructured data collected by organizations that can be mined for information and used in machine learning projects, predictive modeling and other advanced analytics applications.
data modeling
Data modeling is the process of creating a simplified diagram of a software system and the data elements it contains, using text and symbols to represent the data and how it flows.
tree structure
A tree data structure is a method for placing and locating files (called records or keys) in a database.
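One common tree structure is a binary search tree, where smaller keys branch left and larger keys branch right; here is a minimal sketch of inserting and locating keys:

```python
# Binary search tree sketch: insert keys, then locate them by
# descending left for smaller keys and right for larger ones.

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

root = None
for k in [50, 30, 70, 20]:
    root = insert(root, k)
```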
data mart (datamart)
A data mart is a repository of data that is designed to serve a particular community of knowledge workers.
data lake
A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications.
compliance
Compliance is the state of being in accordance with established guidelines or specifications, or the process of becoming so.
dark data
Dark data is digital information an organization collects, processes and stores that is not currently being used for business purposes.
Google Cloud Spanner
Google Cloud Spanner is a distributed relational database service designed to support global online transaction processing deployments, SQL semantics, horizontal scaling and transactional consistency.
semantic technology
Semantic technology is a set of methods and tools that provide advanced means for categorizing and processing data, as well as for discovering relationships within varied data sets.
Apache Flink
Apache Flink is a distributed data processing platform for use in big data applications, primarily involving analysis of data stored in Hadoop clusters.
Apache Spark
Apache Spark is an open source parallel processing framework for running large-scale data analytics applications across clustered computers.
multimodel database
A multimodel database is a data processing platform that supports multiple data models, which define the parameters for how the information in a database is organized and arranged.
data silo
A data silo exists when an organization's departments and systems cannot, or do not, communicate freely with one another and share business-relevant data.
data fabric
A data fabric is an architecture and set of software services that provide a unified collection of data assets, databases and database architectures within an enterprise.
data architect
A data architect is an IT professional responsible for defining the policies, procedures, models and technologies to be used in collecting, organizing, storing and accessing company information.
NoSQL database types explained: Column-oriented databases
Learn about the uses of column-oriented databases and the large data model, data warehousing and high-performance querying benefits these NoSQL databases bring to organizations.
How to choose exactly the right data story for your audience
A data practitioner has two jobs: tell the right data story and tell it in the right way to win over project stakeholders, data expert Larry Burns says in his latest book.
Quiz: Test your understanding of the Hadoop ecosystem
This quiz will test your knowledge of Hadoop basics, including the framework, its capabilities and related technologies.
stream processing
Stream processing is a data management technique that involves ingesting a continuous data stream to quickly analyze, filter, transform or enhance the data in real time.
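A minimal sketch using Python generators, with a made-up sensor feed, shows how a stream can be filtered as readings arrive rather than after batch collection:

```python
# Stream processing sketch: readings flow through a filter one at a
# time; nothing is buffered into a batch first. Values are illustrative.

def sensor_stream():
    # Stand-in for a continuous source such as a message queue.
    for value in [21.5, 99.0, 22.1, -40.0, 23.0]:
        yield value

def valid_readings(stream, low=-30.0, high=60.0):
    # Drop out-of-range readings in flight.
    for v in stream:
        if low <= v <= high:
            yield v

results = list(valid_readings(sensor_stream()))
# results == [21.5, 22.1, 23.0]
```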
7 data modeling techniques and concepts for business
Three types of data models and various data modeling techniques are available to data management teams to help convert data into valuable business information.
9 steps to a dynamic data architecture plan
Learn the nine steps to a comprehensive data architecture plan, including C-suite support, data personas, user needs, governance, catalogs, SWOT, lifecycles, blueprints and maps.
How to build a successful cloud data architecture
As enterprises vacate the premises and migrate their operations skyward, a cloud data architecture can provide the long-term flexibility to improve workflows, costs and security.
columnar database
A columnar database is a database management system (DBMS) that stores data in columns instead of rows.
relational database
A relational database is a collection of information that organizes data points with defined relationships for easy access.
Db2
Db2 is a family of database management system (DBMS) products from IBM that serve a number of different operating system (OS) platforms.
flat file
A flat file is a collection of data stored in a two-dimensional database in which similar yet discrete strings of information are stored as records in a table.
hashing
Hashing is the process of transforming any given key or a string of characters into another value.
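For example, using Python's built-in hashlib, the same input always hashes to the same fixed-size digest, while even a small change to the input produces a very different one:

```python
import hashlib

# Hashing maps any input to a fixed-size value: identical inputs yield
# identical digests, and a one-character change yields a different digest.
digest1 = hashlib.sha256(b"data management").hexdigest()
digest2 = hashlib.sha256(b"data management").hexdigest()
digest3 = hashlib.sha256(b"Data management").hexdigest()
```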
big data management
Big data management is the organization, administration and governance of large volumes of both structured and unstructured data.
spatial data
Spatial data is any type of data that directly or indirectly references a specific geographical area or location.
query
A query is a question or a request for information expressed in a formal manner. In computer science, a query is essentially the same thing; the difference is that the answer or retrieved information comes from a database.
schema
In computer programming, a schema (pronounced SKEE-mah) is the organization or structure for a database, while in artificial intelligence (AI) a schema is a formal expression of an inference rule.
star schema
A star schema is a database organizational structure optimized for use in a data warehouse or business intelligence that uses a single large fact table to store transactional or measured data, and one or more smaller dimensional tables that store descriptive attributes of that data.
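As a tiny illustrative sketch in SQLite (the table names and rows are made up), a fact table of measured sales joins to a smaller dimension table for analysis:

```python
import sqlite3

# Star schema sketch: fact_sales holds measured data; dim_product is a
# dimension table describing it. Both tables are illustrative examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO fact_sales VALUES (1, 10.0), (2, 40.0), (1, 5.0);
""")
# Analytical queries join facts to dimensions to slice by attributes.
books_total = conn.execute("""
    SELECT SUM(f.amount) FROM fact_sales f
    JOIN dim_product d ON d.product_id = f.product_id
    WHERE d.category = 'books'
""").fetchone()[0]
```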
information
Information is a stimulus that has meaning in some context for its receiver. When information is entered into and stored in a computer, it is generally referred to as data.
RFM analysis (recency, frequency, monetary)
RFM analysis is a marketing technique used to quantitatively rank and group customers based on the recency, frequency and monetary total of their recent transactions to identify the best customers and perform targeted marketing campaigns.
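A minimal sketch of computing raw recency, frequency and monetary values per customer (the customers, orders and dates are made-up examples):

```python
from datetime import date

# RFM sketch: per customer, compute recency (days since last order),
# frequency (order count) and monetary (total spend). Data is illustrative.

orders = [
    ("alice", date(2023, 6, 1), 120.0),
    ("alice", date(2023, 6, 20), 80.0),
    ("bob",   date(2023, 1, 5), 30.0),
]

def rfm(orders, today):
    scores = {}
    for customer, day, amount in orders:
        last, freq, money = scores.get(customer, (None, 0, 0.0))
        last = day if last is None or day > last else last
        scores[customer] = (last, freq + 1, money + amount)
    return {c: ((today - last).days, f, m)
            for c, (last, f, m) in scores.items()}

result = rfm(orders, date(2023, 7, 1))
```

A real RFM analysis would then bin these raw values into ranked tiers (for example, quintile scores of 1 to 5) to segment customers.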
denormalization
Denormalization is the process of adding precomputed redundant data to an otherwise normalized relational database to improve read performance of the database.
raw data (source data or atomic data)
Raw data (sometimes called source data, atomic data or primary data) is data that has not been processed for use.
Building a big data architecture: Core components, best practices
To process the infinite volume and variety of data collected from multiple sources, most enterprises need to get with the program and build a multilayered big data architecture.