HeLp & TeAcH: DBMS

Database management system (DBMS)

A database management system (DBMS), sometimes just called a database manager, is a program that lets one or more computer users create and access data in a database. The DBMS manages user requests (and requests from other programs) so that users and other programs are free from having to understand where the data is physically located on storage media and, in a multi-user system, who else may also be accessing the data. In handling user requests, the DBMS ensures the integrity of the data (that is, making sure it continues to be accessible and is consistently organized as intended) and security (making sure only those with access privileges can access the data). The most typical DBMS is a relational database management system (RDBMS). A standard user and program interface is the Structured Query Language (SQL). A newer kind of DBMS is the object-oriented database management system (ODBMS).

A DBMS can be thought of as a file manager that manages data in databases rather thanfiles in file systems. In IBM's mainframe operating systems, the nonrelational data managers were (and are, because these legacy application systems are still used) known asaccess methods.

A DBMS is usually an inherent part of a database product. On PCs, Microsoft Access is a popular example of a single- or small-group user DBMS. Microsoft's SQL Server is an example of a DBMS that serves database requests from multiple (client) users. Other popular DBMSs (these are all RDBMSs, by the way) are IBM's DB2, Oracle's line of database management products, and Sybase's products.

IBM's Information Management System (IMS) was one of the first DBMSs. A DBMS may be used by or combined with transaction managers, such as IBM's Customer Information Control System (CICS).

What is Database

A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.

In computing, databases are sometimes classified according to their organizational approach. The most prevalent approach is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.

Computer databases typically contain aggregations of data records or files, such as sales transactions, product catalogs and inventories, and customer profiles. Typically, a database manager provides users the capabilities of controlling read/write access, specifying report generation, and analyzing usage. Databases and database managers are prevalent in largemainframe systems, but are also present in smaller distributed workstation and mid-range systems such as the AS/400 and on personal computers. SQL (Structured Query Language) is a standard language for making interactive queries from and updating a database such as IBM's DB2, Microsoft's SQL Server, and database products from Oracle, Sybase, and Computer Associates.

Advantages of the Database Approach

· == Data Independence ==

o The data is held in such a way that changes to the structure of the database do not effect any of the programs used to access the data.

· == Consistency of Data ==

o Each item of data is held only once therefore no danger of item being updated on one system and not on another.

· == Control Over Redundancy ==

o In a non-database system, the same information may be held on several files. This wastes space and makes updating more time-consuming. A database system minimises these effects.

o *** == Integrity of Data ==

§ **** The DBMS provides users with the ability to specify constraints on data such as making a field entry essential or using a validation routine.

§ == Greater Security of Data ==

§ The DBMS can ensure only authorised users are allowed access to the data.

§ == Centralised Control of Data ==

§ The Database Administrator will control who has access to what and will structure the database with the needs of the

§ == More Information Available to Users ==

§ Users have access to a wider range of data that was previously held in separate departments andsometimes on incompatible systems.

§ == Increased Productivity ==

§ The DBMS provides an easy to use query language that allows users to get immediate response from their queries rather than having to use a specialist "programmer" to write queries for them. whole department in mind

o = Disdvantages of the Database Approach =

§ == Larger Size ==

§ More disk space is required and probably a larger and more powerful computer.

§ == Greater Complexity ==

§ For optimum use the database must be very carefully designed. If not done well, the new system may fail to satisfy anyone.

§ == Greater Impact of System Failure ==

§ "All eggs in one basket."

§ == More Complex Recovery Procedures ==

§ If a system failure occurs it is vital that no data is lost.

Data abstraction:

It can be summed up as follows.
1. When the DBMS hides certain details of how data is stored and maintained, it provides what is called as the abstract view of data.
2. This is to simplify user-interaction with the system.
3. Complexity (of data and data structure) is hidden from users through several levels of abstraction.

Data abstraction is used for following purposes:

1. To provide abstract view of data.
2. To hide complexity from user.
3. To simplify user interaction with DBMS.

The Three Levels of Abstraction

There are three levels of data abstraction.

1. Physical level: It describes how a record (e.g., customer) is stored.
Features:
a) Lowest level of abstraction.
b) It describes how data are actually stored.
c) It describes low-level complex data structures in detail.
d) At this level, efficient algorithms to access data are defined.

2. Logical level: It describes what data stored in database, and the relationships among the data.
Features:
a) It is next-higher level of abstraction. Here whole Database is divided into small simple structures.
b) Users at this level need not be aware of the physical-level complexity used to implement the simple structures.
c) Here the aim is ease of use.
d) Generally, database administrators (DBAs) work at logical level of abstraction.

3. View level: Application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
Features:
a) It is the highest level of abstraction.
b) It describes only a part of the whole Database for particular group of users.
c) This view hides all complexity.
d) It exists only to simplify user interaction with system.
e) The system may provide many views for the whole system.

Data Independence in DBMS:

Data independence :- Data independence is one of the main advantages of DBMS. The ability to modify a schema(structure of database) definition in one level without offering a schema definition in the next higher level is called a data independence.

There are two types of data independence :-

1) Physical data Independence :- Physical data Independence is the ability to modify the physical schema without cauring affect of the logical level. Or in other words we can say that the old programs are not be changed when changes occurred in the physical story structure in which data are stored.

2) Logical data Independence :- Logical data Independence is the modify the conceptual scheme without causing offers of the view level. Or in other words we do not need to the security the programs

What are the benefits of Database Approach?

The benefits of database approach is to
1. reduce redundancy of information
Data’s are being made simple and complex, we don’t have to input as many data’s which can only result in overloaded space in computer hard disk/ memory.

2. Consistent data flow
Once the data’s are being analyzed in sequence, once can easily identify the flow of information and hence, results will come in reliable output.

3. Integration of data
Incorporating of data’s in tables should result in a highly assimilation of information.

4. Security and User privileges
Protection of data’s from unwanted users and giving user’s rights to what level they are to use the application.

5. Ease of application development.
It is a more comfortable workspace for the relevance of its factual purpose.

Data independence is a form of database management that keeps data separated from all programs that make use of the data. As a cornerstone for the idea of a DBMS or databasemanagement system, data independence ensures that the data cannot be redefined or reorganized by any of the programs that make use of the data. In this manner, the data remains accessible, but is also stable and cannot be corrupted by the applications using it.

Database management within the environment of a centralized DBMS relies on the process ofdata independence. While data transparency still exists as far as the ability of different applications to access and use the data for completing tasks, no program or application can read and then begin to make changes to the data itself. The consistency of the data makes the overall process of maintaining and managing a single database or multiple databases within a single environment much easier.

The concept of data independence is different from another approach referred to as datadependence. With this approach, the data used to run an application is actually represented in the source code for the program. The ability of the data to be modified by the program, and vice versa, means that the data is not considered constant and may be changed either by initiating the change through modifications to the source code or through using the program in other ways to adapt the core data.

Data Abstraction: - Purpose of database users with an abstract view of the data that is system hides certain details of how the data are stored and maintained. It gives an architecture is to separate the user applications and the physical database. It defines in the following three levels:-

1) Internal level or physical level: - The represent data as they would be actually organized on physical storage media. The internal schema uses a physical data model and describes the complete detail of data storage and access path for the data base. It is the lowest and complex level.

2) Conceptual level or logical level: - Logical levels hide the detail of physical storage and describe about the entities data and constraints. It is the middle level and it is used by the application programmer it describes what data are stored in the database and what relationship among those data. All logical tasks are performed at this level.

3) View level or external level: - View level representation of data as they would appear to an application programmer or end user. It is used only by the naïve user. Only a required portion of data is viewed.

Etity-relationship model (ERM or ER model):

The entity-relationship model (or ER model) is a way of graphically representing the logical relationships of entities (or objects) in order to create a database. The ER model was first proposed by Peter Pin-Shan Chen of Massachusetts Institute of Technology (MIT) in the 1970s.

In ER modeling, the structure for a database is portrayed as a diagram, called an entity-relationship diagram (or ER diagram), that resembles the graphical breakdown of a sentence into its grammatical parts. Entities are rendered as points, polygons, circles, or ovals. Relationships are portrayed as lines connecting the points, polygons, circles, or ovals. Any ER diagram has an equivalent relational table, and any relational table has an equivalent ER diagram. ER diagramming is an invaluable aid to engineers in the design, optimization, and debugging of database programs.

In a logical sense, entities are the equivalent of grammatical nouns, such as employees, departments, products, or networks. An entity can be defined by means of its properties, called attributes. Relationships are the equivalent of verbs or associations, such as the act of purchasing, the act of repairing, being a member of a group, or being a supervisor of a department. A relationship can be defined according to the number of entities associated with it, known as the degree.

3NF vs BCNF:

Normalization is a process that is carried out to minimize the redundancies that are present in data in relational databases. This process will mainly divide large tables in to smaller tables with fewer redundancies. These smaller tables will be related to each other through well defined relationships. In a well normalized database, any alteration or modification in data will require modifying only a single table. Third Normal Form (3NF) was introduced in 1971 by Edgar F. Codd, who is also the inventor of the relational model and the concept of normalization. Boyce-Codd Normal Form (BCNF) was introduced in 1974 by Codd and Raymond F. Boyce.

What is 3NF?

3NF is the Third normal form used in relational database normalization. According to the Codd’s definition, a table is said to be in 3NF, if and only if, that table is in the second normal form (2NF), and every attribute in the table that do not belong to a candidate key should directly depend on every candidate key of that table. In 1982 Carlo Zaniolo produced a differently expressed definition for 3NF. Tables that comply with the 3NF generally do not contain anomalies that occur when inserting, deleting or updating records in the table.

What is BCNF?

BCNF (also known as 3.5NF) is another normal form used in relational database normalization. It was introduced to capture some the anomalies that are not addressed by the 3NF. A table is said to be in BCNF, if and only if, for each of the dependencies of the form A → B that are non-trivial, A is a super-key. Decomposing a table that is not in the BCNF normal form does not guarantee the production of tables in the BCNF form (while preserving the dependencies which were present in the original table).

What is the difference between 3NF and BCNF?

Both 3NF and BCNF are normal forms that are used in relational databases to minimize redundancies in tables. In a table that is in the BCNF normal form, for every non-trivial functional dependency of the form A → B, A is a super-key whereas, a table that complies with 3NF should be in the 2NF, and every non-prime attribute should directly depend on every candidate key of that table. BCNF is considered as a stronger normal form than the 3NF and it was developed to capture some of the anomalies that could not be captured by 3NF. Obtaining a table that complies with the BCNF form will require decomposing a table that is in the 3NF. This decomposition will result in additional join operations (or Cartesian products) when executing queries. This will increase the computational time. On the other hand, the tables that comply with BCNF would have fewer redundancies than tables that only comply with 3NF. Furthermore, most of the time, it is possible to obtain a table that comply with 3NF without hindering dependency preservation and lossless joining. But this is not always possible with BCNF.

Domain constraints:

Domain constraints are a user-defined data type which enforces the integrity of the standard data types. A domain consists of all values permitted in a column. In constructing your database you would first select a standard data type such as VARCHAR or INT. You could then constrain or limit the data using NOT NULL, UNIQUE, CHECK, PRIMARY KEY or FOREIGN KEY. For example the CHECK constraint could limit INT data within your domain [column] to numbers less than 5000. You would thus define a custom data type using domain constraints.

A domain is defined as the set of all unique values permitted for an attribute. For example, a domain of date is the set of all possible valid dates, a domain of integer is all possible whole numbers, a domain of day-of-week is Monday, Tuesday ... Sunday.

This in effect is defining rules for a particular attribute. If it is determined that an attribute is a date then it should be implemented in the database to prevent invalid dates being entered.

A classic example of this is where the data from a legacy system is loaded into a newly designed database. The new system is well designed. Columns that hold dates are defined as such whereas, in the old system, they were held as character strings. Much data is rejected because of invalid dates, eg 30 February 2000.

If the system supports domain constraints then this invalid data would not have stored in the first place. That is, the integrity of the database is being preserved.

RELATIONAL ALGEBRA:

It’s a theoretical language which operations work on one or more relations to define another

relation without changing the original relation...

FACILITIES: (operations) of relational algebra:

1. Selection(restriction). Selection of rows E,g salary>100(staff).those employ who have

2. salary more than 1000in salary table ..O _condition(R) e.g O _salary>100(Staff)

3. Projection: select columns from a relation. ∏ _{col1,…….coln}(R)…… ∏ _{sno,fname,address}

4. (Student)

5. Cartesian Product: concatinate rows of one relation to other relation’s record. R * S.. (Sid,name)(strudent)*(cid)(courses)..sid , name are the record of student relations &

6. cid is the record of course table

7. Union: to join two relations…..R ⋃S

union compatible..>data type must be same .

No duplication: no duplicate rows.

1. Difference: tuples which are in R but not n S . R - S ..wat r the members which are

2. in one society but not n another society.

3. Joins Operators: joining of two relations. Join operators are

o Theta Join :matching of two field in two relation. It include =,<,><=,=!.....R_⋈_condition S

o Equi join: Teta is general form so for equal it s equal join, for the greater it s greater

o join.

o Natural join: if there are two columns that will become one.

o Outer join: those which are matching & those which are not matching that is complete

o relation. ⊳⊲

o Semi join: DDB(distributd DB) take relations from one side & then relations from other

o side. R_⊳_f S…where f s condition or predicate. protocol

4. Intersection: tuples that r n both R & S & must be union-compatible. R ⋂S

5. Division : R ∻S

To more info and for pdf file visit our BOOKS page or Click here

HeLp & TeAcH

Pages

DBMS

No comments:

Post a Comment