Data Processing: May 2014

Objectives:

· Describe the relationship between the data-to-wisdom continuum and database systems.

· Explain file structures and database models.

· Describe the purpose, structures, and functions of database management systems (DBMSs).

· Outline the life cycle of a database system.

· Explain concepts and issues related to data warehouses in healthcare.

· Describe the knowledge discovery in databases (KDD) process, including data mining.

Introduction

Nurses are knowledge workers providing care to individuals, families, and communities in the data/information intensive environment of modern healthcare. They are continually collecting data about their clients’ environment. The data are organized and processed, producing information about client needs and potential interventions. Using an extensive nursing knowledge database, the information is interpreted. Nurses then use their knowledge, judgment, and wisdom to develop a plan. The goal of this plan is to provide caring cost-effective quality care to individuals, families, groups, and communities.

In modern healthcare, the process of moving from data collection to implementing and evaluating an individualized plan of care is highly dependent on automated database systems. This chapter introduces the nurse to concepts, theories, models, and issues necessary to understand the effective use of automated database systems.

Defining data, databases, information, and information systems

✓ Data are raw, uninterpreted facts that are without meaning. For example, a patient’s weight is recorded as 140 lb. Without additional information, this fact or datum cannot be interpreted. The patient could be a young child who is overweight or an adult who is several pounds underweight. When data are interpreted, information is produced. While data are meaningless, information by definition is meaningful. For data to be interpreted and information produced, the data must be processed. This means that the data are organized so that patterns and relationships between the data can be identified. There are several approaches to organizing data, including sorting, classifying, summarizing, and calculating.

✓ A database is an organized collection of related data. Placing notes in folders in a file cabinet is one example. A common paper example is the phone book. A much more complex example is a patient’s medical record.

The ability to find information in a database depends on 4 factors:

1. How the data are named (indexed) and organized

2. The size and complexity of the database

3. The type of data within the database

4. The methodology or tools used to search the database

✓ Information systems are used to process data and produce information. The term “information system” is often used to refer to computer systems, but this is only one type of information system. There are manual information systems as well as human information systems. The most effective and complex information system is the human brain. People are constantly taking in data and processing that data to produce meaning.
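The approaches to organizing data named above (sorting, classifying, summarizing, and calculating) can be sketched in a few lines of Python. This is only an illustration: the weights, the 150 lb cutoff, and all variable names are made up and are not clinical standards.

```python
from statistics import mean

# Hypothetical raw data: patient weights in pounds (illustrative values only).
weights_lb = [140, 212, 98, 165, 140]

# Sorting: put the data in order.
sorted_weights = sorted(weights_lb)

# Classifying: group values by an arbitrary, non-clinical cutoff.
CUTOFF_LB = 150
classified = {
    "below_cutoff": [w for w in weights_lb if w < CUTOFF_LB],
    "at_or_above_cutoff": [w for w in weights_lb if w >= CUTOFF_LB],
}

# Summarizing: reduce many values to one descriptive figure.
average_lb = mean(weights_lb)

# Calculating: derive a new value from each datum (1 lb = 0.45359237 kg).
weights_kg = [round(w * 0.45359237, 1) for w in weights_lb]

print(sorted_weights)
print(classified)
print(average_lb)
print(weights_kg)
```

None of these steps adds meaning by itself. They only arrange the data so that a person with the right context can interpret them.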

Types of data

When developing automated database systems, each data element is defined. There are two primary approaches to classifying data in a database system. First, data are classified in terms of how they will be used. Second, data are classified by their computerized data type.

1. Computer-based data types

Alphanumeric data include letters and numbers in any combination; however, the numbers in an alphanumeric field cannot be used to perform numeric functions. An example is a Social Security number.

Numeric data are used to perform numeric functions including adding, subtracting, multiplying, and dividing. There are several different formats as well as types of numeric data. The number of digits after the decimal or the presence of commas in a number are examples of format options. Numeric data can be long integer, currency, or scientific.

Logic data are limited to two options. Some examples include Yes or No, True or False, 1 or 2, and On or Off.

2. Conceptual data types

Conceptual data types reflect how users view the data. These can be based on the source of the data. Conceptual data can also be based on the event that the data are attempting to capture. One of the major advantages of an automated information system is that each of these data elements can be captured once and used many times by different users for different purposes.
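The three computer-based data types described above map loosely onto built-in Python types. The sketch below is an illustration under that assumption; the identifier, the weights, and the helper `can_do_arithmetic` are hypothetical.

```python
# Alphanumeric: digits stored as text. The identifier is a made-up placeholder.
patient_id = "000-00-0000"

# Numeric: values meant for arithmetic.
weight_lb = 140.0
weight_gain_lb = 2.5
new_weight_lb = weight_lb + weight_gain_lb

# Logic: limited to two options, here True or False.
has_allergies = True


def can_do_arithmetic(value):
    """Return True if the value supports numeric addition."""
    try:
        value + 1
        return True
    except TypeError:
        return False


print(can_do_arithmetic(patient_id))  # text field: arithmetic is rejected
print(can_do_arithmetic(weight_lb))   # numeric field: arithmetic works
print(new_weight_lb)
```

Storing an identifier as text is a deliberate choice: it keeps leading zeros and hyphens, and it prevents meaningless operations such as adding two identifiers together.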

✓ Database management systems

DBMSs are computer programs used to input, store, modify, process, and access data in a database. Before a DBMS can be used, the DBMS software must first be configured to manage the data specific to the project. This process of configuring the database software is called database system design. Once the software is configured for the project, the database software is used to enter the project data into the computer. A functioning DBMS consists of three interacting parts. These are the data, the DBMS configured software program, and the query language used to access the data. Some examples of DBMSs in everyday life include computerized library systems, automated teller machines, and flight reservation systems. When these systems are being used, the data, the DBMS, and the query language interact together. As a result, it is easy to confuse one with the other.
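To keep the three parts apart, the following sketch uses SQLite through Python's standard `sqlite3` module. SQLite is chosen here only as a convenient example of a DBMS; the table name, column names, and patient rows are hypothetical.

```python
import sqlite3

# The DBMS: SQLite, opened here with a temporary in-memory database.
conn = sqlite3.connect(":memory:")

# Database system design: configure the software for this project's data.
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, l_name TEXT, weight_lb REAL)"
)

# The data: entered through the configured DBMS.
conn.executemany(
    "INSERT INTO patient (id, l_name, weight_lb) VALUES (?, ?, ?)",
    [(1, "Smith", 140.0), (2, "Brown", 182.5)],
)
conn.commit()

# The query language: SQL, used to access the stored data.
rows = conn.execute(
    "SELECT l_name, weight_lb FROM patient WHERE weight_lb > ? ORDER BY id",
    (150,),
).fetchall()
print(rows)
conn.close()
```

The `CREATE TABLE` statement is the design step, the inserted rows are the data, and the `SELECT` statement is the query. All three pass through the same program, which is why they are easy to confuse.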

✓ Advantages of automated database management systems

Automated DBMSs decrease data redundancy, increase data consistency, and improve access to all data. These advantages result from the fact that in a well-designed automated system all data exist in only one place. The datum is never repeated.

Data redundancy occurs when the same data are stored in the database more than once. Making a copy of class notes to store the same notes in two different folders is an example.
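A small sketch can show why redundancy threatens consistency. The two "files" and the patient key below are hypothetical; plain dictionaries stand in for stored data.

```python
# Redundant design: the same address is stored in two places.
billing_file = {"patient_01": {"address": "20 North St"}}
clinic_file = {"patient_01": {"address": "20 North St"}}

# The patient moves, but only one copy is updated.
billing_file["patient_01"]["address"] = "408 Same St"
copies_agree = (
    billing_file["patient_01"]["address"] == clinic_file["patient_01"]["address"]
)

# Single-copy design: the address is stored once and looked up by identifier.
patients = {"patient_01": {"address": "20 North St"}}
billing_view = {"patient_id": "patient_01"}
clinic_view = {"patient_id": "patient_01"}

patients["patient_01"]["address"] = "408 Same St"
billing_address = patients[billing_view["patient_id"]]["address"]
clinic_address = patients[clinic_view["patient_id"]]["address"]

print(copies_agree)
print(billing_address == clinic_address)
```

In the redundant design the two copies disagree after one update. In the single-copy design every user sees the same current value because there is only one datum to change.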

✓ Fields, records, and files

Examples:

ID	F-NAME	L-NAME	ADDRESS-1	ADDRESS-2	CITY	ST
01	Betty	Smith	SRU, School of Nursing	20 North St	Pgh	PA
02	Leslie	Brown	DBMS Institute	408 Same St	NY	NY
03	Dori	Jones	Party Place	5093 Butler St	Any	VA
04	Glenn	Clark	Univ of Study	987 Carriage Rd
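In the usual sense of these terms, each column of the table above is a field, each row is a record, and the whole table is a file. The sketch below reads a shortened copy of the table (the address columns are left out for brevity) with Python's `csv` module; the variable names are hypothetical.

```python
import csv
import io

# Three rows of the example table, stored as tab-separated text (a file).
file_text = (
    "ID\tF-NAME\tL-NAME\tCITY\tST\n"
    "01\tBetty\tSmith\tPgh\tPA\n"
    "02\tLeslie\tBrown\tNY\tNY\n"
    "03\tDori\tJones\tAny\tVA\n"
)

# Each row read from the file is one record; each named column is one field.
records = list(csv.DictReader(io.StringIO(file_text), delimiter="\t"))

first_record = records[0]
field_value = first_record["L-NAME"]
print(len(records), field_value)
```

Note that the ID field is read as text, so the leading zero in "01" is preserved.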

✓ Types of files

· Processing files

Executable files consist of a computer program or set of instructions that, when executed, cause the computer to open or start a specific computer program or function. These are the files that tell a computer what actions it should perform when running a program.

Command files are a set of instructions that perform a set of functions as opposed to running a whole program.

A batch file contains a set of operating system commands.

· Data files

Data files contain data that have been captured and stored on a computer using a software program. Many times the extension for the file identifies the software program used to create the file.

The master index file contains the unique identifier and related indexes for all entities in the database. An example is the identification file for all patient records in a healthcare system.

✓ Database models

A database system provides access to both the data in the database and the interrelationships within and between the various data elements. Building a database begins by identifying these data elements and the relationships that exist between them.

The American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) model has proven effective since the 1970s. The ANSI/SPARC model identifies three views or models of the data elements and their relationships. These 3 views are the users’ view, the logical view, and the physical view (Whitehorn, 2000).

The first model, and the first step in building the database, is to understand the data and the data relationships from the users’ perspective. This is referred to as the external or user model.

The users’ view is the wish list of requirements that the user will have for the database. It is the list of functional specifications describing the queries, reports, and procedures that can be produced by the database. The user model is then used as a guide for structuring the physical database within the computer. The common ground between the users’ view and the physical view is the conceptual model.

✓ Conceptual models

A conceptual model includes a diagram and narrative description of the data elements, their attributes, and the relationships between the data. It defines the structure of the whole database in terms of the attributes of the entities (data elements), relationships, constraints, and operations.

✓ Structural or physical data models

The physical data model includes each of the data elements and the relationship between the data elements, as they will be physically stored on the computer.

4 primary approaches to the development of a physical data model:

1. Hierarchical model: Hierarchical databases have been compared to inverted trees. All access to data starts at the top of the hierarchy, or at the root. The table at the root will have pointers called branches that point to tables with data that relate hierarchically to the root. Each table is referred to as a node.

2. Network model: Developed from hierarchical models. In a network model, the child node is not limited to one parent. This makes it possible for a network model to represent many-to-many relationships; however, the presence of multiple links between data does make redesign more difficult when data relationships change.

3. Relational database model: Consists of a series of files set up as tables. Each column represents an attribute, and each row is a record. Another name for a row is “tuple.” The intersection of the row and the column is a cell. The datum in the cell is the manifestation of the attribute for that record. Each cell may contain only one attribute. The datum must be atomic, or broken down into its smallest format.

Table A

ID	L - NAME	F - NAME	SEX	B – DATE
12	Smith	Tom	M	01 – 23 – 73
14	Brown	Robert	M	02 – 01 – 77
13	Jones	Mary Lou	F	12 – 12 – 54
15	Yurick	Edward	M	04 – 04 - 38

Table B

ID	DX - 1	DX – 2	DX – 3	DX – 4
12	MI	CVA	GLAUCOMA	PVD
14	CVA	HEPATITIS C	COLITIS	UTI
13	DIABETES M	ANGINA	CVA	GOUT
15	CERF	ANEMIA	GLAUCOMA	PEPTIC ULCER

Table C

ID	L - NAME	F - NAME	DX - 1	DX – 2	DX - 3
12	Smith	Tom		CVA
14	Brown	Robert	CVA
13	Jones	Mary Lou			CVA

4. Object-oriented model

The object-oriented database was developed because the relational model has a limited ability to deal with binary large objects, or BLOBs. BLOBs are complex data types such as images, sounds, spreadsheets, or text messages. They are large nonatomic data with parts and subparts that are not easily represented in a relational database. In object-oriented databases, the entity as well as the attributes of the entity are stored with the object. An object can store other objects as well. In the object-oriented model, the data definition includes both the object and its attributes.
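Returning to the relational model, Table C above can be read as the result of a query that links Table A and Table B through the shared ID column and keeps only patients with CVA among their first three diagnoses. That reading is an assumption drawn from the table contents. The sketch below uses SQLite through Python's `sqlite3` module, with hypothetical table and column names and only the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (id INTEGER PRIMARY KEY, l_name TEXT, f_name TEXT)"
)
conn.execute(
    "CREATE TABLE table_b (id INTEGER PRIMARY KEY, dx_1 TEXT, dx_2 TEXT, dx_3 TEXT)"
)

conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [
    (12, "Smith", "Tom"),
    (14, "Brown", "Robert"),
    (13, "Jones", "Mary Lou"),
    (15, "Yurick", "Edward"),
])
conn.executemany("INSERT INTO table_b VALUES (?, ?, ?, ?)", [
    (12, "MI", "CVA", "GLAUCOMA"),
    (14, "CVA", "HEPATITIS C", "COLITIS"),
    (13, "DIABETES M", "ANGINA", "CVA"),
    (15, "CERF", "ANEMIA", "GLAUCOMA"),
])

# Join on the shared ID and keep patients with CVA in any diagnosis column.
table_c = conn.execute(
    """
    SELECT a.id, a.l_name, a.f_name
    FROM table_a AS a
    JOIN table_b AS b ON a.id = b.id
    WHERE 'CVA' IN (b.dx_1, b.dx_2, b.dx_3)
    ORDER BY a.id
    """
).fetchall()
print(table_c)
conn.close()
```

The query returns patients 12, 13, and 14 and leaves out patient 15, who has no CVA diagnosis. The names and diagnoses are each stored once; the join combines them only when a question is asked.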

DATABASE LIFE CYCLE

The development and use of a DBMS follow a systematic process called the life cycle of a database system. The number of steps used to describe this process can vary from one author to another.

✓ Initiation occurs when a need or problem is identified and the development of a DBMS is seen as a potential solution. This initial assessment looks at what the need is, what the current approaches are, and what the potential options are for dealing with the need.

PLANNING AND ANALYSIS

This step begins with an assessment of the users’ view and the development of the conceptual model. This includes the internal and external uses of information.

DETAILED SYSTEMS DESIGN

The detailed systems design (DSD) begins with the selection of the physical model: hierarchical, network, relational, or object-oriented. Using the physical model, each table and the relationships between the tables are developed. At this point, the data entry screens and the format for all output reports will be carefully designed. The users in the department must validate the data entry screens and output formats. It is often helpful to use prototypes and screenshots to get user input during this stage. Revisions are to be expected.

✓ Implementation

Implementation includes training the users, testing the system, developing a procedure manual for use of the system, piloting the DBMS, and finally “going live.” The procedure manual outlines the “rules” for how the system is used in day-to-day operations.

✓ Evaluation and Maintenance

When a new database system has been installed, the developers and the users can be very anxious to immediately evaluate the system. Initial or early evaluations may have limited value. It will take a few weeks or even months for users to adjust their work routines to this new approach to information management. The first evaluations should be informal and focus more on troubleshooting specific problems. Once the system is up and running and users have adjusted to the new information processing procedure, they will have a whole new appreciation of the value of a DBMS. At this point, a number of requests for new options can be expected.

✓ Common Database Operations

DBMSs vary from small programs running on a personal computer to massive programs that manage the data for large international enterprises. No matter what size or how a DBMS is used, there are common operations that are performed by all DBMSs. There are 3 basic types of data processing operations.

1. Data input

2. Data processing

3. Data output

✓ DATA PROCESSING PROCESSES

These are DBMS-directed actions that the computer performs on the data once they are entered into the system. It is these processes that are used to convert raw data into meaningful information. In large databases, these processes are referred to as online transaction processing (OLTP). OLTP is defined as the real-time processing of transactions to support the day-to-day operation of the institution.
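A transaction in this sense is a group of changes that must succeed or fail together. The sketch below is a simplified illustration using SQLite through Python's `sqlite3` module, not a production OLTP system; the bed and admission tables and the `admit` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bed (bed_id INTEGER PRIMARY KEY, occupied INTEGER NOT NULL)"
)
conn.execute("CREATE TABLE admission (patient_id INTEGER, bed_id INTEGER)")
conn.execute("INSERT INTO bed VALUES (1, 0)")
conn.commit()


def admit(conn, patient_id, bed_id):
    """Mark the bed occupied and record the admission as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            updated = conn.execute(
                "UPDATE bed SET occupied = 1 WHERE bed_id = ? AND occupied = 0",
                (bed_id,),
            ).rowcount
            if updated != 1:
                raise ValueError("bed is not available")
            conn.execute(
                "INSERT INTO admission VALUES (?, ?)", (patient_id, bed_id)
            )
        return True
    except ValueError:
        return False


first = admit(conn, 101, 1)
second = admit(conn, 102, 1)
admissions = conn.execute("SELECT patient_id, bed_id FROM admission").fetchall()
print(first, second, admissions)
conn.close()
```

The first admission succeeds and the second is refused because the bed is already occupied, so the database never shows two patients in one bed.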

✓ DATA OUTPUT OPERATIONS

This section includes online and written reports. The approach to designing these reports will have a major impact on what information the reader actually gains from the report. Reports that are clear and concise help the reader see the information in the data. On the other hand, poorly designed reports can mislead and confuse the reader.

The development of departmental databases serves 2 important purposes:

1. Both the developers and the users create a new level of knowledge and skill.

2. As individual departments develop databases, institutional data are being created; however, if each department develops its individual database system, in isolation, islands of automation are then developed.

THE DEVELOPMENT OF DATA WAREHOUSES

Healthcare institutions have been automating their processes and developing databases since the mid-1960s. In most institutions, this process began in two areas: the financial department and departmental systems. Some of the oldest and most developed departmental systems are in the labs, radiology, medical records, and cardiac departments. Initially these systems developed as islands of automation that were focused on the operational needs of the individual department. The development of these systems and the interfaces between these systems were strongly influenced by the fee-for-service approach to financing healthcare.

A data warehouse is defined as a large collection of data imported from several different systems into one database. The source of the data includes not only internal data from the institution but can also include data from external sources.

Bill Inmon, the father of the data warehouse concept, defined a data warehouse as a subject-oriented, integrated, time-variant, nonvolatile collection of data used to support the management decision-making process (Lambert, 1999).

PURPOSES OF A DATA WAREHOUSE

The development of a data warehouse requires a great deal of time, energy, and money. An organization’s decision to develop a data warehouse is based on several goals and purposes. Because of its integrated nature a data warehouse spares users from the need to learn several different applications.

Functions of data warehouse:

The management of a data warehouse requires three types of programs. First, the data warehouse must be able to extract data from the various computer systems and import that data into the data warehouse. Second, the data warehouse must function as a database able to store and process all of the data in the database. This includes the ability to aggregate the data and process the aggregated data. Third, the data warehouse must be able to deliver the data in the warehouse back to the users in the form of information.
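These three functions can be sketched as three small Python functions. This is a toy illustration, not a description of any real warehouse product; the two source systems, their record layouts, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical source systems, each with its own record layout.
lab_system = [
    {"pt": 12, "test": "glucose", "value": 110},
    {"pt": 13, "test": "glucose", "value": 180},
]
billing_system = [
    {"patient_id": 12, "charge": 250.0},
    {"patient_id": 13, "charge": 400.0},
    {"patient_id": 12, "charge": 75.0},
]


def extract(lab_rows, billing_rows):
    """Function 1: pull data from each source into one common layout."""
    warehouse = []
    for row in lab_rows:
        warehouse.append({"patient_id": row["pt"], "source": "lab",
                          "measure": row["test"], "value": row["value"]})
    for row in billing_rows:
        warehouse.append({"patient_id": row["patient_id"], "source": "billing",
                          "measure": "charge", "value": row["charge"]})
    return warehouse


def aggregate_charges(warehouse):
    """Function 2: process the stored data, here by totaling charges per patient."""
    totals = defaultdict(float)
    for row in warehouse:
        if row["measure"] == "charge":
            totals[row["patient_id"]] += row["value"]
    return dict(totals)


def deliver(totals):
    """Function 3: return the aggregated data to users as readable information."""
    return [f"Patient {pid}: total charges ${total:.2f}"
            for pid, total in sorted(totals.items())]


warehouse = extract(lab_system, billing_system)
report = deliver(aggregate_charges(warehouse))
print("\n".join(report))
```

The extract step is where the differing field names of the source systems ("pt" versus "patient_id") are reconciled, which is what makes the later aggregation across departments possible.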

Data from a data warehouse can be used to support a number of activities including (AHIMA, 1998):

1. Decision support for caregivers at the point of care

2. Outcome measurements and quality improvement

3. Clinical research and professional education

4. Reporting to external agencies, e.g., Joint Commission on Accreditation of Healthcare Organizations

5. Market trend analysis and strategic planning

6. Health services management and process reengineering

7. Targeted outreach to patients, professionals, and other community groups

Quality of the data

In a data warehouse, data are entered once but used by many users for a number of different purposes. As a result, the quality of the data takes on a whole new level of importance. In addition, the concept of data ownership changes. When dealing with a department information system, the department is usually seen as owning the data and being responsible for the quality of that data…………

Data to Knowledge (D2K)

The process of extracting information and knowledge from large-scale databases has been referred to as knowledge discovery and data mining (KDD) or D2K applications. AGL used this approach to coin the term Image to Knowledge (I2K) when referring to the mining of imaging data. While some authors use the terms data mining and D2K interchangeably, others consider data mining one step in the D2K process. D2K uses powerful automated approaches for the extraction of hidden predictive information from large databases.

Data Mining Process

Approach	Description	Examples of Methods
Predicting	Discovering variables that predict or classify a future event	Decision tree, neural networks
Discovery	Discovering patterns, associations, or clusters within a large dataset	Apriori fractionalization
Deviation	Discovering the norm via pattern recognition and then discovering deviations from this norm	Scatterplots, parallel coordinates
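As one illustration of the discovery approach, the sketch below counts which diagnoses occur together across patient records. It is a simplified version of the first two passes of the Apriori idea (count single items, then count only pairs built from frequent items), not a full Apriori implementation. The records and the minimum support of 2 are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: the set of diagnoses recorded for each patient.
patient_diagnoses = [
    {"MI", "CVA", "GLAUCOMA"},
    {"CVA", "HEPATITIS C", "COLITIS"},
    {"DIABETES M", "ANGINA", "CVA"},
    {"MI", "CVA", "ANGINA"},
]
MIN_SUPPORT = 2  # an item or pair must appear in at least this many records

# Pass 1: count single items and keep the frequent ones.
item_counts = Counter(item for record in patient_diagnoses for item in record)
frequent_items = {item for item, n in item_counts.items() if n >= MIN_SUPPORT}

# Pass 2: count only pairs built from frequent items (the pruning step).
pair_counts = Counter()
for record in patient_diagnoses:
    candidates = sorted(record & frequent_items)
    for pair in combinations(candidates, 2):
        pair_counts[pair] += 1

frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= MIN_SUPPORT}
print(frequent_pairs)
```

The pruning step relies on a simple fact: a pair cannot appear in two records unless each of its members does, so rare diagnoses can be dropped before pairs are counted. On large databases this keeps the number of candidate patterns manageable.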

The Nursing Context

Data Processing

Sunday, May 4, 2014
