Database management
system (DBMS)
A database management
system (DBMS), sometimes just called a database manager, is a
program that lets one or more computer users create and
access data in a database. The DBMS manages user requests (and
requests from other programs) so that users and other programs are free from
having to understand where the data is physically located on storage media and,
in a multi-user system, who else may also be accessing the data. In handling
user requests, the DBMS ensures the integrity of the data
(that is, making sure it continues to be accessible and is consistently
organized as intended) and security (making sure only those
with access privileges can access the data). The most typical DBMS is
a relational database management system (RDBMS). A standard user and program
interface is the Structured Query Language (SQL). A newer kind of DBMS is the
object-oriented database management system (ODBMS).
A DBMS can be thought of
as a file manager that manages data in databases rather thanfiles
in file systems. In IBM's mainframe operating systems, the
nonrelational data managers were (and are, because these legacy
application systems are still used) known asaccess methods.
A DBMS is usually an
inherent part of a database product. On PCs, Microsoft Access is a popular
example of a single- or small-group user DBMS. Microsoft's SQL
Server is an example of a DBMS that serves database requests from multiple
(client) users. Other popular DBMSs (these are all RDBMSs, by the way) are
IBM's DB2, Oracle's line of database management products,
and Sybase's products.
IBM's Information
Management System (IMS) was one of the first DBMSs. A DBMS may be used by or
combined with transaction managers, such as IBM's Customer
Information Control System (CICS).
What is Database
A
database is a collection of information that is organized so that it
can easily be accessed, managed, and updated. In one view, databases can be
classified according to types of content: bibliographic, full-text, numeric,
and images.
In
computing, databases are sometimes classified according to their organizational
approach. The most prevalent approach is the relational database, a
tabular database in which data is defined so that it can be reorganized and
accessed in a number of different ways. A distributed database is one that can
be dispersed or replicated among different points in a network.
An object-oriented programming database is one that is congruent with
the data defined in object classes and subclasses.
Computer
databases typically contain aggregations of data records or files, such as
sales transactions, product catalogs and inventories, and customer profiles.
Typically, a database manager provides users the capabilities of controlling
read/write access, specifying report generation, and analyzing usage. Databases
and database managers are prevalent in largemainframe systems, but are
also present in smaller distributed workstation and mid-range systems
such as the AS/400 and on personal computers. SQL (Structured Query
Language) is a standard language for making interactive queries from and
updating a database such as IBM's DB2, Microsoft's SQL Server, and
database products from Oracle, Sybase, and Computer Associates.
· ==
Data Independence ==
o The data is held in such a way that
changes to the structure of the database do not effect any of the programs used
to access the data.
· ==
Consistency of Data ==
o Each item of data is held only once
therefore no danger of item being updated on one system and not on another.
· ==
Control Over Redundancy ==
o In a non-database system, the same
information may be held on several files. This wastes space and makes updating
more time-consuming. A database system minimises these effects.
o *** == Integrity of Data ==
§ **** The DBMS provides users with the ability to
specify constraints on data such as making a field entry essential or using a
validation routine.
§ == Greater Security of Data ==
§ The DBMS can ensure only authorised users are
allowed access to the data.
§ == Centralised Control of Data ==
§ The Database Administrator will control who has
access to what and will structure the database with the needs of the
§ == More Information Available to Users ==
§ Users have access to a wider range of data that
was previously held in separate departments andsometimes on incompatible
systems.
§ == Increased Productivity ==
§ The DBMS provides an easy to use query language
that allows users to get immediate response from their queries rather than
having to use a specialist "programmer" to write queries for them.
whole department in mind
o = Disdvantages of the Database
Approach =
§ == Larger Size ==
§ More disk space is required and probably a larger
and more powerful computer.
§ == Greater Complexity ==
§ For optimum use the database must be very
carefully designed. If not done well, the new system may fail to satisfy
anyone.
§ == Greater Impact of System Failure ==
§ "All eggs in one basket."
§ == More Complex Recovery Procedures ==
§ If a system failure occurs it is vital that no
data is lost.
Data
abstraction:
It can be summed up as
follows.
1. When the DBMS hides certain details of how data is stored and maintained, it provides what is called as the abstract view of data.
2. This is to simplify user-interaction with the system.
3. Complexity (of data and data structure) is hidden from users through several levels of abstraction.
1. When the DBMS hides certain details of how data is stored and maintained, it provides what is called as the abstract view of data.
2. This is to simplify user-interaction with the system.
3. Complexity (of data and data structure) is hidden from users through several levels of abstraction.
Data abstraction is used for
following purposes:
1. To
provide abstract view of data.
2. To hide complexity from user.
3. To simplify user interaction with DBMS.
2. To hide complexity from user.
3. To simplify user interaction with DBMS.
The Three Levels of Abstraction
There
are three levels of data abstraction.
1. Physical
level: It describes how a record (e.g., customer) is stored.
Features:
a) Lowest level of abstraction.
b) It describes how data are actually stored.
c) It describes low-level complex data structures in detail.
d) At this level, efficient algorithms to access data are defined.
2. Logical level: It describes what data stored in database, and the relationships among the data.
Features:
a) It is next-higher level of abstraction. Here whole Database is divided into small simple structures.
b) Users at this level need not be aware of the physical-level complexity used to implement the simple structures.
c) Here the aim is ease of use.
d) Generally, database administrators (DBAs) work at logical level of abstraction.
3. View level: Application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
Features:
a) It is the highest level of abstraction.
b) It describes only a part of the whole Database for particular group of users.
c) This view hides all complexity.
d) It exists only to simplify user interaction with system.
e) The system may provide many views for the whole system.
Features:
a) Lowest level of abstraction.
b) It describes how data are actually stored.
c) It describes low-level complex data structures in detail.
d) At this level, efficient algorithms to access data are defined.
2. Logical level: It describes what data stored in database, and the relationships among the data.
Features:
a) It is next-higher level of abstraction. Here whole Database is divided into small simple structures.
b) Users at this level need not be aware of the physical-level complexity used to implement the simple structures.
c) Here the aim is ease of use.
d) Generally, database administrators (DBAs) work at logical level of abstraction.
3. View level: Application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
Features:
a) It is the highest level of abstraction.
b) It describes only a part of the whole Database for particular group of users.
c) This view hides all complexity.
d) It exists only to simplify user interaction with system.
e) The system may provide many views for the whole system.
Data Independence in DBMS:
Data independence :- Data independence is one of the main
advantages of DBMS. The ability to modify a schema(structure of database)
definition in one level without offering a schema definition in the next higher
level is called a data independence.
There are two types of data independence :-
1) Physical data Independence :- Physical data
Independence is the ability to modify the physical schema without cauring
affect of the logical level. Or in other words we can say that the old programs
are not be changed when changes occurred in the physical story structure in
which data are stored.
2) Logical
data Independence :- Logical data Independence is the
modify the conceptual scheme without causing offers of the view level. Or in
other words we do not need to the security the programs
What are
the benefits of Database Approach?
The benefits of database approach is to
1. reduce redundancy of information
Data’s are being made simple and complex, we don’t have to input as many data’s which can only result in overloaded space in computer hard disk/ memory.
2. Consistent data flow
Once the data’s are being analyzed in sequence, once can easily identify the flow of information and hence, results will come in reliable output.
3.
Integration of data
Incorporating of data’s in tables should result in a highly assimilation of information.
Incorporating of data’s in tables should result in a highly assimilation of information.
4.
Security and User privileges
Protection of data’s from unwanted users and giving user’s rights to what level they are to use the application.
Protection of data’s from unwanted users and giving user’s rights to what level they are to use the application.
5. Ease
of application development.
It is a more comfortable workspace for the relevance of its factual purpose.
It is a more comfortable workspace for the relevance of its factual purpose.
Data independence is a form
of database management that keeps data separated from all
programs that make use of the data. As a cornerstone for the idea of
a DBMS or databasemanagement
system, data independence ensures that the data cannot
be redefined or reorganized by any of the programs that make use of the data.
In this manner, the data remains accessible, but is also stable and
cannot be corrupted by the applications using it.
Database management within the environment of a centralized
DBMS relies on the process ofdata independence.
While data transparency still exists as far as the ability of
different applications to access and use the data for completing
tasks, no program or application can read and then begin to make changes to
the data itself. The consistency of the data makes the
overall process of maintaining and managing a single database or
multiple databases within a single environment much easier.
The concept of data independence is different
from another approach referred to as datadependence. With this approach,
the data used to run an application is actually represented in
the source code for the program. The ability of the data to
be modified by the program, and vice versa, means that the data is
not considered constant and may be changed either by initiating the change
through modifications to the source code or through using the program in other
ways to adapt the core data.
Data
Abstraction: - Purpose of
database users with an abstract view of the data that is system hides certain
details of how the data are stored and maintained. It gives an architecture is
to separate the user applications and the physical database. It defines in the
following three levels:-
1) Internal
level or physical level: - The represent data as they would be actually
organized on physical storage media. The internal schema uses a physical data
model and describes the complete detail of data storage and access path for the
data base. It is the lowest and complex level.
2) Conceptual
level or logical level: - Logical levels hide the detail of physical
storage and describe about the entities data and constraints. It is the middle
level and it is used by the application programmer it describes what data are
stored in the database and what relationship among those data. All
logical tasks are performed at this level.
3) View
level or external level: - View level representation of data as they would
appear to an application programmer or end user. It is used only by the naïve
user. Only a required portion of data is viewed.
Etity-relationship
model (ERM or ER model):
The
entity-relationship model (or ER model) is a way of graphically representing
the logical relationships of entities (or objects) in order to create a
database. The ER model was first proposed by Peter Pin-Shan Chen of
Massachusetts Institute of Technology (MIT) in the 1970s.
In ER
modeling, the structure for a database is portrayed as a diagram, called an
entity-relationship diagram (or ER diagram), that resembles the graphical
breakdown of a sentence into its grammatical parts. Entities are rendered
as points, polygons, circles, or ovals. Relationships are portrayed as lines
connecting the points, polygons, circles, or ovals. Any ER diagram has an
equivalent relational table, and any relational table has an equivalent ER
diagram. ER diagramming is an invaluable aid to engineers in the design,
optimization, and debugging of database programs.
In a
logical sense, entities are the equivalent of grammatical nouns, such as
employees, departments, products, or networks. An entity can be defined by
means of its properties, called attributes. Relationships are the equivalent of
verbs or associations, such as the act of purchasing, the act of repairing,
being a member of a group, or being a supervisor of a department. A
relationship can be defined according to the number of entities associated with
it, known as the degree.
3NF vs
BCNF:
Normalization
is a process that is carried out to minimize the redundancies that are present
in data in relational databases. This process will mainly divide large tables
in to smaller tables with fewer redundancies. These smaller tables will be
related to each other through well defined relationships. In a well normalized
database, any alteration or modification in data will require modifying only a
single table. Third Normal Form (3NF) was introduced in 1971 by Edgar F. Codd,
who is also the inventor of the relational model and the concept of
normalization. Boyce-Codd Normal Form (BCNF) was introduced in 1974 by Codd and
Raymond F. Boyce.
What is
3NF?
3NF is
the Third normal form used in relational database normalization. According to
the Codd’s definition, a table is said to be in 3NF, if and only if, that table
is in the second normal form (2NF), and every attribute in the table that do
not belong to a candidate key should directly depend on every candidate key of
that table. In 1982 Carlo Zaniolo produced a differently
expressed definition for 3NF. Tables that comply with the 3NF generally do not
contain anomalies that occur when inserting, deleting or updating records in
the table.
What is
BCNF?
BCNF
(also known as 3.5NF) is another normal form used in relational database
normalization. It was introduced to capture some the anomalies that are not
addressed by the 3NF. A table is said to be in BCNF, if and only if, for each
of the dependencies of the form A → B that are non-trivial, A
is a super-key. Decomposing a table that is not in the BCNF normal form does
not guarantee the production of tables in the BCNF form (while preserving the
dependencies which were present in the original table).
What is the
difference between 3NF and BCNF?
Both
3NF and BCNF are normal forms that are used in relational databases to minimize
redundancies in tables. In a table that is in the BCNF normal form, for every
non-trivial functional dependency of the form A → B, A is a super-key whereas, a table that complies with 3NF
should be in the 2NF, and every non-prime attribute should directly depend on
every candidate key of that table. BCNF is considered as a stronger normal form
than the 3NF and it was developed to capture some of the anomalies that could
not be captured by 3NF. Obtaining a table that complies with the BCNF form will
require decomposing a table that is in the 3NF. This decomposition will result
in additional join operations (or Cartesian products) when executing queries.
This will increase the computational time. On the other hand, the tables that
comply with BCNF would have fewer redundancies than tables that only comply
with 3NF. Furthermore, most of the time, it is possible to obtain a table that
comply with 3NF without hindering dependency preservation and lossless joining.
But this is not always possible with BCNF.
Domain
constraints:
Domain
constraints are a user-defined data type which enforces the integrity of the standard
data types. A domain consists of all values permitted in a column. In
constructing your database you would first select a standard data type such as
VARCHAR or INT. You could then constrain or limit the data using NOT NULL,
UNIQUE, CHECK, PRIMARY KEY or FOREIGN KEY. For example the CHECK constraint
could limit INT data within your domain [column] to numbers less than 5000. You
would thus define a custom data type using domain constraints.
A domain is defined as the set of all unique values permitted
for an attribute. For example, a domain of date is the set of all possible
valid dates, a domain of integer is all possible whole numbers, a domain of
day-of-week is Monday, Tuesday ... Sunday.
This in
effect is defining rules for a particular attribute. If it is determined that
an attribute is a date then it should be implemented in the database to prevent
invalid dates being entered.
A
classic example of this is where the data from a legacy system is loaded into a
newly designed database. The new system is well designed. Columns that hold
dates are defined as such whereas, in the old system, they were held as
character strings. Much data is rejected because of invalid dates, eg 30
February 2000.
If the
system supports domain constraints then this invalid data would not have stored
in the first place. That is, the integrity of the database is being preserved.
RELATIONAL ALGEBRA:
It’s a theoretical
language which operations work on one or more relations to define
another
relation without changing
the original relation...
FACILITIES: (operations)
of relational algebra:
1. Selection(restriction).
Selection of rows E,g salary>100(staff).those employ who have
2. salary more than
1000in salary table ..O condition (R) e.g O salary>100 (Staff)
3. Projection:
select columns from a relation. ∏ col1,…….coln (R)…… ∏ sno,fname,address
4. (Student)
5. Cartesian Product:
concatinate rows of one relation to other relation’s record. R * S..
(Sid,name)(strudent)*(cid)(courses)..sid , name are the record of student
relations &
6. cid is the
record of course table
7. Union: to join
two relations…..R ⋃S
union compatible..>data type must be same .
No duplication: no duplicate rows.
1. Difference:
tuples which are in R but not n S . R - S ..wat r the members which are
2. in one society
but not n another society.
3. Joins Operators:
joining of two relations. Join operators are
o Theta
Join :matching of two field in two relation. It include
=,<,><=,=!.....R ⋈condition S
o Equi
join: Teta is general form so for equal it s equal join, for the greater it s
greater
o join.
o Natural
join: if there are two columns that will become one.
o Outer
join: those which are matching & those which are not matching that is
complete
o relation. ⊳⊲
o Semi
join: DDB(distributd DB) take relations from one side & then relations from
other
o side.
R ⊳f S…where f s
condition or predicate. protocol
4. Intersection:
tuples that r n both R & S & must be union-compatible. R ⋂S
No comments:
Post a Comment