ACID - Atomicity, Consistency, Isolation, Durability
ACID describes the transaction-enforcement properties of a database: atomicity, consistency, isolation, and durability.
Batch processing is a method of handling data and tasks in bulk, where a group of similar operations or computations is executed together as a single batch.
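As a minimal sketch (the batch size and the per-record operation are illustrative assumptions), bulk work can be grouped and executed batch by batch rather than one record at a time:

```python
from itertools import islice

def batched(records, batch_size):
    """Yield successive fixed-size batches from an iterable of records."""
    it = iter(records)
    while batch := list(islice(it, batch_size)):
        yield batch

# Hypothetical workload: square ten numbers, four at a time.
results = []
for batch in batched(range(10), batch_size=4):
    results.extend(x * x for x in batch)  # process the whole batch together

print(results)  # squares of 0..9
```

Real batch systems apply the same grouping idea to file loads or scheduled jobs; the batch size trades memory use against per-batch overhead.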
Data analytics involves examining, transforming, and interpreting data sets to uncover meaningful insights, patterns, and trends.
Data applications are software programs or systems designed to gather, process, analyze, and present data in a user-friendly and meaningful way.
Data engineering refers to designing, building, and maintaining the architecture required to collect, transform, and store data.
Data governance is the set of practices that ensure the appropriate use of an organization's data assets.
Data lineage refers to the documented and visual representation of the path and transformations that data undergoes.
Data modeling creates a structured and visual representation of how data is organized, related, and stored within a database or information system.
Data sharing is the deliberate and controlled practice of exchanging or providing access to data between systems.
Data transformation is converting, altering, or reformatting raw data from one state or structure into another.
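A minimal illustration (the field names and records are hypothetical): reshaping raw source records into a cleaner structure with plain Python:

```python
# Hypothetical raw records as they might arrive from a source system.
raw = [
    {"NAME": "  Ada Lovelace ", "SIGNUP": "2023-01-15"},
    {"NAME": "alan turing",     "SIGNUP": "2023-02-01"},
]

def transform(record):
    """Normalize one raw record: trim/title-case the name, split the date."""
    year, month, _day = record["SIGNUP"].split("-")
    return {
        "name": record["NAME"].strip().title(),
        "signup_year": int(year),
        "signup_month": int(month),
    }

clean = [transform(r) for r in raw]
print(clean[0])  # {'name': 'Ada Lovelace', 'signup_year': 2023, 'signup_month': 1}
```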
A data warehouse is a centralized repository that stores historical and current data from various sources within an organization.
dbt (data build tool) is an open-source command-line tool for analytics engineering that lets teams transform data in their warehouse using SQL-based models.
Feature engineering is the process of selecting, creating, and transforming variables (features) within a dataset to enhance machine learning models' performance and predictive capabilities.
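For instance (a toy sketch; the raw fields and derived features are assumptions), new predictive signals can be derived from existing columns:

```python
from datetime import date

# Hypothetical raw dataset: one row per customer order.
rows = [
    {"order_date": date(2024, 3, 2),  "amount": 120.0, "items": 3},
    {"order_date": date(2024, 3, 11), "amount": 45.0,  "items": 1},
]

def engineer_features(row):
    """Derive features a model might use instead of the raw columns."""
    return {
        "avg_item_price": row["amount"] / row["items"],  # ratio feature
        "is_weekend": row["order_date"].weekday() >= 5,  # calendar feature
        "high_value": row["amount"] > 100,               # threshold flag
    }

features = [engineer_features(r) for r in rows]
print(features[0])
```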
Generative AI Apps
Generative AI apps are applications powered by artificial intelligence algorithms that can autonomously create content such as text, images, audio, or code.
Google BigQuery is a fully managed, serverless data warehousing and analytics platform offered by Google Cloud.
HIPAA - Health Insurance Portability and Accountability Act
The Health Insurance Portability and Accountability Act (HIPAA) is a United States regulatory framework for the healthcare industry that sets standards for protecting sensitive patient health information.
LLMs - Large Language Models
Large Language Models (LLMs) are advanced artificial intelligence systems trained on large volumes of text to understand and generate natural language.
Machine Learning is a branch of artificial intelligence that involves the development of algorithms that learn patterns from data and improve with experience, rather than following explicitly programmed rules.
MLOps - Machine Learning Operations
MLOps is a set of practices that combines machine learning, DevOps, and data engineering to ensure efficient collaboration and the seamless integration of machine learning models into production systems.
Microsoft Azure is a comprehensive cloud computing platform and set of services provided by Microsoft.
Parquet is an open-source columnar file format optimized for efficiently storing and processing large datasets.
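The columnar idea behind Parquet can be illustrated in plain Python (a conceptual sketch, not the actual Parquet format): the same records stored row-wise versus column-wise, where the column layout lets a query touch only the fields it needs:

```python
# Row-oriented layout: one record per entry.
rows = [
    {"city": "Oslo",  "temp": 4},
    {"city": "Cairo", "temp": 29},
    {"city": "Lima",  "temp": 18},
]

# Column-oriented layout (the idea behind Parquet): one array per field.
columns = {
    "city": [r["city"] for r in rows],
    "temp": [r["temp"] for r in rows],
}

# A query over one field scans a single contiguous array, skipping the rest.
avg_temp = sum(columns["temp"]) / len(columns["temp"])
print(avg_temp)  # 17.0
```

Parquet adds per-column compression and encoding on top of this layout, which is why analytical queries over a few columns of a wide table are so much cheaper than in row-oriented formats.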
PHI - Protected Health Information
Protected Health Information (PHI) refers to any individually identifiable health-related data created, received, or stored in the course of providing healthcare services, and is protected under regulations such as HIPAA.
PII - Personally Identifiable Information
Personally Identifiable Information (PII) refers to any data that can be used to uniquely identify, locate, or contact an individual.
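As a small illustration (the pattern and sample text are assumptions, not a production redaction tool), PII such as email addresses can be masked before data is shared:

```python
import re

# Simple illustrative pattern; real PII detection is far more involved.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask_pii(text):
    """Replace email addresses with a fixed placeholder."""
    return EMAIL.sub("[REDACTED]", text)

record = "Contact ada@example.com or turing@example.org for details."
print(mask_pii(record))  # Contact [REDACTED] or [REDACTED] for details.
```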
Power BI is a business analytics and data visualization platform developed by Microsoft.
Sales analytics uses data analysis and statistical techniques to examine sales-related information, trends, and patterns.
Semi-structured data refers to information that does not fit neatly into a rigid database but contains some level of organization or hierarchy.
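JSON is a common example: the record below (contents are illustrative) has hierarchy and named fields but no fixed table schema, and can be navigated directly:

```python
import json

# Illustrative semi-structured record: nested fields, optional keys.
doc = json.loads("""
{
  "user": {"id": 42, "name": "ada"},
  "events": [
    {"type": "click", "ts": "2024-01-01T10:00:00"},
    {"type": "purchase", "ts": "2024-01-01T10:05:00", "amount": 19.99}
  ]
}
""")

# Navigate the hierarchy without a predefined schema;
# note "amount" exists on some events but not others.
purchases = [e for e in doc["events"] if e["type"] == "purchase"]
total = sum(e.get("amount", 0) for e in purchases)
print(doc["user"]["name"], total)  # ada 19.99
```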
Sigma Computing is a cloud-based analytics and business intelligence platform that empowers non-technical users to explore and analyze data through a familiar spreadsheet-like interface.
Snowflake is a cloud-based data warehousing platform that provides scalable and flexible solutions for large volumes of data.
Snowpark is a developer framework within the Snowflake Data Cloud platform that allows data pipelines and applications to be written in languages such as Python, Java, and Scala and executed inside Snowflake.
Snowpipe is an automated data ingestion feature within the Snowflake Data Cloud platform that continuously loads data as new files arrive in a stage.
Snowpipe Streaming is a feature within the Snowflake Data Cloud platform that enables continuous and real-time data ingestion from streaming sources.
SQL to Snowflake Migration
SQL to Snowflake migration refers to transferring and adapting existing database systems, applications, or data workloads that use SQL to Snowflake.
A star schema is a database schema used in data warehousing and analytics in which a central fact table of measures is joined to surrounding dimension tables of descriptive attributes, forming a star-like shape.
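A tiny in-memory sketch (table and column names are hypothetical): a fact table of sales keyed to dimension tables for products and dates, queried the way a warehouse would join and aggregate them:

```python
# Dimension tables: descriptive attributes keyed by surrogate IDs.
dim_product = {1: {"name": "Widget", "category": "Hardware"}}
dim_date    = {20240101: {"year": 2024, "quarter": "Q1"}}

# Fact table: numeric measures plus foreign keys into the dimensions.
fact_sales = [
    {"product_id": 1, "date_id": 20240101, "revenue": 99.0},
    {"product_id": 1, "date_id": 20240101, "revenue": 25.0},
]

# A typical star-schema query: join facts to a dimension and aggregate.
revenue_by_category = {}
for row in fact_sales:
    category = dim_product[row["product_id"]]["category"]
    revenue_by_category[category] = revenue_by_category.get(category, 0.0) + row["revenue"]

print(revenue_by_category)  # {'Hardware': 124.0}
```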
Streaming data refers to a continuous flow of real-time, time-sensitive data from various sources such as sensors, devices, social media, or applications.
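As a minimal sketch (the generator stands in for a real stream such as a sensor feed), streaming data is processed record by record as it arrives rather than after a stored batch completes:

```python
def sensor_stream():
    """Stand-in for a continuous source: yields readings one at a time."""
    for reading in [21.5, 22.0, 21.8, 35.2, 22.1]:
        yield reading

# Process each reading the moment it arrives; keep running aggregates.
alerts, count, total = [], 0, 0.0
for value in sensor_stream():
    count += 1
    total += value
    if value > 30:  # react in real time, not after the batch ends
        alerts.append(value)

print(f"mean={total / count:.2f}, alerts={alerts}")
```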