Data analysis

Systematic examination, interpretation, and transformation of raw data into meaningful insights, patterns, and trends. Part of monitoring and other crucial business processes. It involves the application of various statistical, mathematical, or other expert methods to uncover relationships, draw inferences, and derive valuable information from datasets. Unlike data collection, which focuses on the gathering of information, data analysis centers on exploring, organizing, and interpreting data to reveal underlying patterns or relationships that can inform decision-making and support the achievement of specific objectives.

Close terminology

Examples of synonyms and closely related terms:

Data Exploration: The preliminary phase of data analysis involving the examination and summary of key characteristics of the dataset.

Inferential Statistics: Statistical techniques that make predictions or inferences about a population based on a sample of data.

Hypothesis Testing: The process of assessing the validity of a claim or hypothesis about a population parameter using statistical methods.

Data Visualization: The representation of data through charts, graphs, or other visual elements to facilitate understanding and insights.

Outlier Detection: Identifying data points that deviate significantly from the overall pattern in a dataset.

Predictive Modeling: Building models to predict future outcomes or trends based on historical data.

Cross-Validation: A technique used to assess the performance of a predictive model by partitioning the data into subsets for training and testing.
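
To make the partitioning idea behind cross-validation concrete, here is a minimal Python sketch. It assumes the simplest variant (contiguous k-fold splits without shuffling). The function names, the mean-based "model" and the error measure are hypothetical and chosen only for illustration.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds get one additional item.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k, fit, score):
    """Train on k-1 folds, score on the held-out fold, average the scores."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        )
    return sum(scores) / len(scores)


# Hypothetical stand-in model: always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


# Hypothetical score: mean absolute error of that constant prediction.
def mean_abs_error(model, test_x, test_y):
    return sum(abs(y - model) for y in test_y) / len(test_y)
```

For ordered data (for example time series), contiguous folds like these may be inappropriate, and shuffled or time-aware splits are commonly used instead.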

Requirement

Questions or objectives that guide the analysis and help determine the appropriate approach, methods and indicators. These questions keep the analysis aligned with the goals it is meant to serve.

Data analysis takes requirements as its process input, e.g. from goal-setting or an ad hoc decision. It can also produce requirements as an output for other processes, such as planning, decision or change management.

For improving the monitoring of the electronic environment of the state administration, in cooperation with CERT.LV, the Ministry shall develop criteria to identify institutions where CERT.LV should deploy security sensors and shall develop a strategy for broader installation and use of security sensors. Appropriate and sufficient information about the national IS and related ICT infrastructure is a prerequisite for planning, determining and monitoring uniform principles of IS accessibility and ICT continuity management.
Can we rely on the access to IS and the receipt of e-services? LRVK - Latvia 2022

The eAddress is only one of the electronic communication channels, and there is no detailed monitoring approach and indicators provided for clear identification whether eAddress messages replace existing paper documents with subsequent actual cost savings or other electronic communication channels with corresponding additional quality benefits but already without significant cost savings.
Does the country ensure effective use of the official electronic address in communication with individuals and businesses? LRVK - Latvia 2021

Problem Formulation

The crucial process of devising a data science solution to a business problem. Its purpose can be the identification of crucial elements, opportunities and risks, prediction, optimization of processes, etc.

Information on problems in ensuring the level of accessibility is not collected in a centralized way in the country, nor have the causes and consequences of the problem of not reaching the specified level of accessibility been analysed. When not all the identified problems are recorded and their causes are not evaluated, providing reasonable proposals for improvements is impossible; hence, state institutions continue to maintain e-services in the long term, but nothing contributes to improving their accessibility.
Can we rely on the access to IS and the receipt of e-services? LRVK - Latvia 2022

Modelling

Creating a simplified representation of a complex system or dataset to understand its structure and handle its crucial elements.
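
As one illustration of a "simplified representation", the sketch below reduces a set of (x, y) observations to two numbers, the slope and intercept of a least-squares line. Linear regression is an assumed example here; the definition above covers many other kinds of models. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean; zero means all x values are identical.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    # Co-variation of x and y around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

The two returned parameters stand in for the whole dataset. Whether that simplification is acceptable depends on how well a straight line describes the underlying relationship.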

Hypothesis

Specific statement or assumption that is tested during the analysis. Hypotheses provide a framework for focused analysis and make calibration and interpretation of results easier. At the same time, they carry a risk of research bias.
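
A hypothesis becomes testable once it is stated as a comparison that the data can contradict. The following Python sketch shows one possible approach, a two-sided permutation test for the hypothesis "the two groups have the same mean". The choice of test, the function name and the default parameters are assumptions made for illustration, not a method prescribed here.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    Under the null hypothesis the group labels are interchangeable, so we
    reshuffle the pooled values and count how often a random relabelling
    produces a difference at least as large as the observed one.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)
```

Fixing the hypothesis and the test before looking at the results is one common way to limit the research bias mentioned above.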

Data Definitions

Features or variables of items in the dataset, including their units of measurement, types and metadata, relationships, and significance. Understanding data features, including their relevance and availability, is supported by domain knowledge and is crucial for accurate interpretation and contextual analysis. In a technical sense, data structures can be redefined by applying normalization (reducing redundancy) or denormalization (improving read performance by introducing redundancy).
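
The difference between normalized and denormalized structures can be sketched with plain Python dictionaries. The records, field names and helper functions below are hypothetical and exist only to show the trade-off: the flat form repeats the institution name on every row, while the normalized form stores it once and refers to it by key.

```python
# Denormalized (flat) rows: the institution name is repeated for each service.
flat_rows = [
    {"service_id": 1, "service": "Service X", "institution_id": 10, "institution": "Institution A"},
    {"service_id": 2, "service": "Service Y", "institution_id": 10, "institution": "Institution A"},
    {"service_id": 3, "service": "Service Z", "institution_id": 20, "institution": "Institution B"},
]


def normalize(rows):
    """Split flat rows into an institutions lookup and a services list."""
    institutions = {}
    services = []
    for row in rows:
        # Each institution name is stored once, keyed by its identifier.
        institutions[row["institution_id"]] = {"name": row["institution"]}
        services.append(
            {
                "service_id": row["service_id"],
                "service": row["service"],
                "institution_id": row["institution_id"],
            }
        )
    return institutions, services


def denormalize(institutions, services):
    """Join the two structures back into flat rows for fast reading."""
    return [
        {**svc, "institution": institutions[svc["institution_id"]]["name"]}
        for svc in services
    ]
```

In the normalized form, renaming an institution means changing one entry. In the flat form, every affected row has to be updated, but no lookup is needed when reading.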

INs and OUTs (section under development)

coming in

going out

Controls to review

regulation, documentation, reports