Saturday, April 25, 2009

Database Normalization

I. Process Design Database (review)

In Process Design Database there are:
  1. Collect requirement of user / business
  2. Develop E-R Model based requirement of user / business
  3. Convert E-R Model to relationship gathering ( tables)
  4. Relationship normalization to eliminated or removed anomaly
  5. Implementation goes to database by make table for each relationship already most normalization

II. Database Normalization

In the field of relational database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.
Normalization phase started from lightest phase ( 1NF) until tightest ( 5NF). Usually only coming up with level 3NF or BCNF because have adequate enough to yield the tables of which with quality goodness.

Why done by normalization?
  • Optimizing table structures.
  • Improve speed.
  • Remove same data.
  • More efficient in usage of storage media.
  • Lessen redundancy.
  • Avoid anomalies ( anomalies insertion, anomalies deletion, update anomalies).
  • Improved Data integrity.

III. Functional Dependency

A functional dependency (FD) is a constraint between two sets of attributes in a relation from a database.
Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y, also in R, (written X --> Y) if and only if each X value is associated with precisely one Y value. Customarily we call X the determinant set and Y the dependent attribute. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding value of the Y attribute. For the purposes of simplicity, given that X and Y are sets of attributes in R, X --> Y denotes that X functionally determines each of the members of Y - in this case Y is known as the dependent set. Thus, a candidate key is a minimal set of attributes that functionally determine all of the attributes in a relation.

Example :

Functional Dependency:
  • NRP--> Nama
  • Mata_Kuliah, NRP --> Nilai

Non Functional Dependency:
  • Mata_Kuliah --> NRP
  • NRP --> Nilai

Functional Dependency from tables of Nilai:

Nrp --> Nama
Because to each Nrp value is same, so Name value is same too.

{Mata_Kuliah, NRP} --> Nilai
Because attribut value depend on Mata_Kuliah and NRP by together. In other meaning for the Mata_Kuliah and NRP is same, so have same Nilai too, because Mata_Kuliah and NRP is key ( having the character is unique).

Mata_Kuliah --> NRP
NRP --> nilai

IV. First Normal Form / 1NF

First normal form (1NF or Minimal Form) is a normal form used in database normalization. A relational database table that adheres to 1NF is one that meets a certain minimum set of criteria. These criteria are basically concerned with ensuring that the table is a faithful representation of a relation and that it is free of repeating groups.
The concept of a "repeating group" is, however, understood in different ways by different theorists. As a consequence, there is no universal agreement as to which features would disqualify a table from being in 1NF.

Not allowed for :
  • Attribute that valuably there are many ( Multi valued attribute )
  • Attribute composite or combine of both

So :
  • Price of Domain attribute must represent atomic price

For Example
Table "Mahasiswa"


That Tables is not complete 1NF condition.
Decomposition becomes:

Table mahasiswa :

Table hoby :

V. Second Normal Form / 2NF

Second normal form (2NF) is a normal form used in database normalization. 2NF was originally defined by E.F. Codd in 1971. A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second normal form. Specifically: a 1NF table is in 2NF if and only if, given any candidate key and any attribute that is not a constituent of a candidate key, the non-key attribute depends upon the whole of the candidate key rather than just a part of it.
In slightly more formal terms: a 1NF table is in 2NF if and only if none of its non-prime attributes are functionally dependent on a part (proper subset) of a candidate key. (A non-prime attribute is one that does not belong to any candidate key.)

Functional dependency X --> Y is said full if erase an attribute A of X its mean that Y no longer functional dependent. Functional dependency X --> Y is said partial if erases an A's attribute of X matter Y still functional dependent. Relationship schemer on 2NF form if each attribute non primary key A R full dependent functionally on primary key R.

This table accomplishes 1NF, but not exclude 2NF :

That table is Not accomplishing 2NF form, because (NIM, KodeMk) is regarded as the primary key:

{NIM, KodeMk} -> NamaMhs
{NIM, KodeMk} -> Alamat
{NIM, KodeMk} -> Matakuliah
{NIM, KodeMk} -> SKS
{NIM, KodeMk} -> NilaiHuruf

Table need to decomposition become some table measures 2NF

Functional dependency as follows:
  • {NIM, KodeMk} --> NilaiHuruf (fd1)
  • NIM --> {NamaMhs, Alamat} (fd2)
  • KodeMk --> {Matakuliah, Sks} (fd3)

  • fd1 (NIM, KodeMk, NilaiHuruf) --> Tabel Nilai
  • fd2 (NIM, NamaMhs, Alamat) --> Tabel Mahasiswa
  • fd3 (KodeMk, Matakuliah, Sks) --> Tabel MataKuliah

VI. Third Normal Form

The third normal form (3NF) is a normal form used in database normalization. 3NF was originally defined by E.F. Codd in 1971. Codd's definition states that a table is in 3NF if and only if both of the following conditions hold:
  1. The relation R (table) is in second normal form (2NF)
  2. Every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every key of R.

An example of a 2NF table that fails to meet the requirements of 3NF is:

Because still there are attribute not primary key (namely Kota and Provinsi) own depended to attribute not other primary key (name is Kodepos) :
  • Kodepos --> {Town, Provinsi }
So that the table require to decomposition become :
  • Mahasiswa ( NIM, Namamhs, Jalan, Kodepos)
  • Kodepos ( Kodepos, Provinsi, Town)

VII. Boyce-Codd Normal Form / BCNF

Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). A table is in Boyce-Codd normal form if and only if, for every one of its non-trivial functional dependencies X --> Y, X is a superkey—that is, X is either a candidate key or a superset thereof.

VIII. Fourth and Fifth Normal Form

Relationship in fourth normal form (4NF) if relationship in BCNF and not contains multi value dependency. To remove multi value dependency from one relationship, we are divide relationship become two new relationships. Each relationship contains of two attributes have multi value relationship.

Relationship in fifth normal form (5NF) get business with property is calling join without marks sense information loss (lossless join). The fifth normal Form (5 NF) also know as PJNF (projection join normal form). This case is very rare to appearance and hard to detect practically.


Saturday, April 18, 2009

Database and Entity Relationship Diagram

I. Database

The definition of a database is a structured collection of records or data that is stored in a computer system. In order for a database to be truly functional, it must not only store large amounts of records well, but be accessed easily. In addition, new information and changes should also be fairly easy to input.
Database can meaning a collection of interconnection data for a enterprise (company, government institution or private sector).

II. Database Management System (DBMS)

A database management system (DBMS) is computer software that manages databases. DBMSes may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way.
Example of DBMS :
  • MySQL
  • PostgreSQL
  • Oracle

III. Bit, Byte and Field
  • A bit is a binary digit, taking a value of either 0 or 1. Binary digits are a basic unit of information storage and communication in digital computing and digital information theory.
  • A byte is a basic unit of measurement of information storage in computer science. In many computer architectures it is a unit of memory addressing. There is no standard but a byte most often consists of eight bits.
  • Field is a group of same byte, in database used by attribute.

IV. Attribute / Field

Attribute or Field is a characteristic from entity, which preparing detailed explanation about that's entity.
A relation could have atribute too.
Example of Attribute :
  • Student : NIM, Name, Sex, Address
  • Car : Plat Number, Color, CC
  • Book : ID, title, author

Type of Attribute

Single value Vs Multi value Attributes
  1. Single Value : only can fill at most one value
  2. Multi Value : can fill with interest from one value with same type
Composite Vs Atomic Attributes
  1. Composite : cluster from some smaller attribute. For example: Name ( First name, middle name, and last name)
  2. Atomic : can not divide into smaller attribute. For example: ID
Derived Attribute : Value of attribute result from other attribute value. For example: age from attribute date of birthday

Null Value Attribute : Attribute that doesn’t have any value for a record

Mandatory Value Attribute : Attribute must have value

V. Record / Tuple
Record is a data line in an relation. Consist of attributes where there attribute can interaction to completely information a entity / relation.

VI. Entity / File
File is a group of same record and have same element, same attribute but different of each data value.
In application process, file can be categorized as :
  • Main file
  • Transaction file
  • Report file
  • History file
  • Protector file
  • Activity file

VII. Domain

Domain is collection of value that can be inside one or more attribute. Each attribute in database relation defined as domain.

VIII. Data Element Key

Key is record's element that is used to find record that on access time or can also be utilized to identify each entity / record / the line.

Types of Key
  • Super Key : superkey is defined in the relational model of database organization as a set of attributes of a relation variable (relvar) for which it holds that in all relations assigned to that variable there are no two distinct tuples (rows) that have the same values for the attributes in this set. Equivalently a superkey can also be defined as a set of attributes of a relvar upon which all attributes of the relvar are functionally dependent.
  • Candidate Key : candidate key is a minimal set of columns necessary to identify a row, this is also called a minimal superkey.
  • Primary Key : primary key is a candidate key to uniquely identify each row in a table. A unique key or primary key comprises a single column or set of columns. One of attribute from candidate keys to get is chosen / is determined as primary keys with three criterions as follows:
  1. that key is more natural to been utilized as basis
  2. that key is simpler
  3. that key its uniqueness indemnity bond
  • Alternate Key : Alternate keys is attribute from candidate keys that don't be chosen becomes primary key.
  • Foreign Key : foreign key is a referential constraint between two tables.The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The columns in the referencing table must be the primary key or other candidate key in the referenced table.
  • External Key : External keys to constitute a lexical attribute (or lexical gathering attribute) one that its point do ever identify one object instance.

IX. Entity Relationship Diagram

ERD is a model of a network that uses word order is stored in the abstract system.
The difference among DFD and ERD :
  • DFD is a model of network functions that will be implemented by the system
  • ERD is a model that emphasizes the network data on the structure and relationship data
Elements Of ERD
  • Entity
In the ER Diagram Entity is described with the form of a rectangle. Entity is something that exists in the real system and the abstract where the data stored or where there are data.
  • Relationship
A relationship is some association between entities. In this section we shall concentrate on binary relationships. That is, associations between two entities. In section 16.4 we shall introduce other N-ary relationships. That is, relationships between one, three, four or N entities. In the E–R approach, more than one relationship can exist between any two entities.
  • Relationship Degree
Are total entity who participated in one relationship. Degree that frequent being used on ERD
  • Attribute
Are character or characteristic of every entity and also relationship.
  • Cardinality
Pointing out record optimum that relations can with entity on entity another one.

X. Relationship Degree
  • Unary Relationship is a relationship model between entity coming from same entity set.
  • Binary Relationship is a relationship model between 2 entity.
  • Ternary Relationship is a relationship between instance from 3 type of entitas unilaterally.

XI. Cardinality

There are 3 cardinalities relations :
  • One to One: Level one to one relationship with the one stated in the entity's first event, only had one relationship with one incident in which the two entities and vice versa.
  • One to Many or Many to One: Level one to many relationship is the same as the one to many depending on the direction from which the relationship seen .for an incident on the first entity can have any relationship with more incident on the second entity, if the one incident on the entity the second can only have one relation with the incident on the first entity.
  • Many To Many: if any incident occurs in an entities have relationships with other entities in the incident.


Saturday, April 4, 2009

Data Flow Diagram

1. Data Flow Diagram
  • Used to perform structured analysis to determine logical requirements
  • A graphical tool, useful for communicating with users, managers, and other IS personnel
  • Useful for analyzing existing as well as proposed systems
  • A relatively simple technique to learn and use
2. Context Diagram

Defines the scope of the system by identifying the system boundary.
  • One process (which represents the entire system)
  • All sources/sinks (external entities)
  • Data flows linking the process to the sources and sinks (external entities)
  • The diagram does not contain any data stores.
  • All external entities are shown on the context diagram as well as major data flow to and from them.
Example of Context Diagram :

3. Zero Diagram
  • Describes the overall processing of the system
  • Show one process for each major processing step or functional requirement
  • Data flows from the context appear on system diagram also (level balancing)
  • Can show a single data store to represent all data in aggregate at this level
  • Can draw duplicate sources, sinks and data stores to increase legibility
Example of Zero Diagram :

4. Detail Diagram
  • Detail Diagram is the explosion of the context level diagram.
  • It should include up to 7 or 9 processes.
  • Processes are numbered with an integer.
  • The major data stores and all external entities are included on Diagram 0.
Numbering Level On DFD :

5. Spesification of Process

Each process in DFD must have the process specification. in top level method used to describe process by using descriptive sentence. At more level detailed that is under process (functional primitive) required the specification structure.Process Specification will become guide to programmer in coding. Method used in the process specification : description process in the story, decision table and decision tree.

6. External Entity
  • Any class of people, an organization, or another system which exists outside the system you are studying.
  • Form the boundaries of the system.
  • The system and external entities exchange data in the form of data flows.
  • Must be named, titles preferred to names of individuals - use a noun
7. Data Flow

  • Data flow shows the data about a person, place, or thing that moves through the system.
  • Names should be a noun that describes the data moving through the system.
  • Arrowhead indicates the flow direction.
  • Use double headed-arrows only when a process is reading data and updating the data on the same table or file.
Other rule:

* Data flow name which turn in at a process may not same by the name of secretary data flow of that process.
* Data flow into or out of data storage doesn't need to give a name if:
o The flow of data simple and easy to understand
o Describes the data flow of all data items
* There may not any data flow of terminal to data storage or on the contrary because terminal isn't part of system, terminal relation with data storage have to pass process.

8. Process

  • Transforms incoming data flows into outgoing dataflows
  • Each process has a unique number and name
  • Name each process using a verb and a noun phrase
  • the name of a process should describe what the process does
  • avoid vague names where possible
  • represent with a bubble or rounded square

9. Data Storage

  • A data store represents a collection of data flows "at rest"
  • Each data store has a unique name
  • The name should describe the contents of the data store
  • A data store may represent many different types of physical locations of data
  • A data store may be a temporary or a permanent repository of data

10. DFD Symbol

11. Data Dictionary

Data dictionary is a reserved space within a database which is used to store information about the database itself. Data dictionary is also called with a system data dictionary is a catalog of facts and data information needs of an information system. In function to help system agent to interpreting application in detail and organization all of data element that utilized by system exactly so user and system analyst have same understanding basic about entry, output, storage and process. In analysis phase, data dictionary used as communication between system analyst with user. in development system phase, data dictionary used to design input, and report database. Data flow in DFD have the character of globally, boldness more detailed can be seen in data dictionary.
Data dictionary load the followings :

  • Name of data current: must note that readers who need further explanation about a flow of data can find it easily
  • Alias: alias or other name of the data can be written when there is
  • Forms of data: used to segment the data dictionary to use when designing the system
  • Flow data: indicates from which data flows and where the data
  • Description: to give an explanation of the meaning of the data flow

12. Balancing In DFD

Data flow into and out of a process must be the same as the flow of data into and out of the details of the process on the level / levels below. Name of the data flow into and out of the process must match the name of the flow of data into and out of the details of the process. Number and the name of an entity outside the process must be equal to the number of names and entities outside of the details of the process.
Things which must be gave attention to DFD owning than one level:

  • There are must input balance and output between one level and level next
  • Balance between level 0 and level 1 seen at input / output of data stream to or from terminal in level 0, while balance between level 1 and level 2 seen at input / output of data stream to/from pertinent process
  • Data flow name, data of storage terminal and every level must be same if its same object

13. Prohibition In DFD
  • The data flow may not directly from one external entity to another external entity without any process.
  • The data flow may not directly from data storage to external entity without any process.
  • The data flow may not directly from data storage to the other data storage without any process.
  • The data flow from one process directly go to another process without any data storage should be avoided.