A database is an organized collection of data. The data are typically organized to model
relevant aspects of reality in a way that supports processes requiring this
information. For example, modeling the availability of rooms in hotels in a way
that supports finding a hotel with vacancies.
History
Following technology progress in the areas of processors, computer memory, computer storage and computer networks, the sizes, capabilities, and
performance of databases and their respective DBMSs have grown by orders of
magnitude. The development of database technology can be divided into three
eras based on data model or structure: navigational,[5] SQL/relational, and
post-relational.
The two main early navigational data models were the hierarchical model,
epitomized by IBM's IMS system, and the CODASYL model
(network model), implemented in a number of products
such as IDMS.
The relational model, first proposed in 1970 by Edgar F. Codd, departed from this tradition by
insisting that applications should search for data by content, rather than by
following links. The relational model employs sets of ledger-style tables, each
used for a different type of entity. Only in the mid-1980s did computing
hardware become powerful enough to allow the wide deployment of relational
systems (DBMSs plus applications). By the early 1990s, however, relational
systems dominated in all large-scale data processing applications, and as of
2014 they remain dominant except in niche areas. The dominant database language, standardised SQL for the
relational model, has influenced database languages for other data models.
Object databases developed in the 1980s to overcome the
inconvenience of object-relational
impedance mismatch, which led to the coining of the term
"post-relational" and also the development of hybrid object-relational databases.
The next generation of post-relational databases in the late 2000s became known
as NoSQL databases, introducing fast key-value stores and document-oriented databases.
A competing "next generation" known as NewSQL databases attempted new
implementations that retained the relational/SQL model while aiming to match
the high performance of NoSQL compared to commercially available relational
DBMSs.
Database management systems (DBMSs) are specially designed software applications
that interact with the user, other applications, and the database itself to
capture and analyze data. A general-purpose DBMS is a software system
designed to allow the definition, creation, querying, update, and
administration of databases. Well-known DBMSs include MySQL, MariaDB, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, SAP HANA, dBASE, FoxPro, IBM DB2, LibreOffice Base and FileMaker Pro. A database is not generally portable across different DBMSs, but different
DBMSs can interoperate by using standards such as SQL and ODBC or JDBC to allow a single application to work
with more than one database. Formally, "database" refers to the data
themselves and supporting data structures. Databases are created to handle
large quantities of information by inputting, storing, retrieving and managing
that information. Databases are set up so that one set of software programs
provides all users with access to all the data. A "database management
system" (DBMS) is a suite of computer software providing the interface
between users and a database or databases. Because they are so closely related,
the term "database" when used casually often refers to both a DBMS
and the data it manipulates. Outside the world of professional information technology, the term database is
sometimes used casually to refer to any collection of data (perhaps a spreadsheet,
maybe even a card index). This article is concerned only with databases where
the size and usage requirements necessitate use of a database management
system.
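The interactions catered for by most existing
DBMSs fall into four main groups:
·
Data definition – Defining new
data structures for a database, removing data structures from the database,
modifying the structure of existing data.
·
Update – Inserting, modifying, and deleting data.
·
Retrieval – Obtaining information either for
end-user queries and reports or for processing by applications.
·
Administration – Registering and monitoring users, enforcing data security,
monitoring performance, maintaining data integrity, dealing with concurrency
control, and recovering information that has been corrupted by some event such
as an unexpected system failure.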
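The definition, update, and retrieval groups listed above can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical "room" table modeled on the hotel-vacancy example from the introduction; the table and column names are illustrative only.

```python
import sqlite3

def demo_interactions():
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()
    # Data definition: create a new data structure (a table).
    cur.execute("CREATE TABLE room (hotel TEXT, number INTEGER, vacant INTEGER)")
    # Update: insert, modify, and delete data.
    cur.executemany(
        "INSERT INTO room VALUES (?, ?, ?)",
        [("Grand", 101, 1), ("Grand", 102, 0), ("Plaza", 7, 1)],
    )
    cur.execute("UPDATE room SET vacant = 0 WHERE hotel = 'Grand' AND number = 101")
    cur.execute("DELETE FROM room WHERE hotel = 'Grand' AND number = 102")
    # Retrieval: ask which hotels still have a vacancy.
    cur.execute("SELECT DISTINCT hotel FROM room WHERE vacant = 1 ORDER BY hotel")
    hotels = [row[0] for row in cur.fetchall()]
    conn.close()
    return hotels

print(demo_interactions())
```

Administration tasks such as user management are not shown, since SQLite, as an embedded library, has no user accounts of its own.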
Most organizations in developed countries today depend on
databases for their business operations.
Increasingly, databases are not only used to support the internal operations of
the organization, but also to underpin its online interactions with customers
and suppliers (see Enterprise software).
Databases are not used only to hold administrative information, but are often
embedded within applications to hold more specialized data: for example
engineering data or economic models. Examples of database applications include
computerized library systems, flight reservation systems,
and computerized parts inventory systems.
Client-server or transactional DBMSs are often complex, because they must maintain
high performance, availability and security while
many users are querying and updating the database at the same time. Personal,
desktop-based database systems tend to be less complex. For example, FileMaker and Microsoft Access come
with built-in graphical user interfaces.
1960s, navigational DBMS
The introduction of the term database coincided
with the availability of direct-access storage (disks and drums) from the
mid-1960s onwards. The term represented a contrast with the tape-based systems
of the past, allowing shared interactive use rather than daily batch processing. The Oxford English
Dictionary cites[6] a 1962 report by the System
Development Corporation of California as the first to use the term
"data-base" in a specific technical sense.
As computers grew in speed and capability, a
number of general-purpose database systems emerged; by the mid-1960s a number
of such systems had come into commercial use. Interest in a standard began to
grow, and Charles Bachman,
author of one such product, the Integrated Data Store (IDS),
founded the "Database Task Group" within CODASYL, the group responsible for the
creation and standardization of COBOL.
In 1971 the Database Task Group delivered their standard, which generally became
known as the "CODASYL approach", and soon a number of commercial
products based on this approach entered the market.
The CODASYL approach relied on the
"manual" navigation of a linked data set which was formed into a
large network. Applications could find records by one of three methods:
·
use of a primary key
(known as a CALC key, typically implemented by hashing)
·
navigating relationships
(called sets) from one record to another
·
scanning all the records
in a sequential order.
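The three access methods above can be sketched in a few lines. This is only an illustration, not CODASYL syntax: the Record and NavigationalStore names are hypothetical, a Python dictionary stands in for CALC-key hashing, and a plain list of member records stands in for a set.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    key: str
    data: str
    members: list = field(default_factory=list)  # a "set": owner -> member records

class NavigationalStore:
    def __init__(self):
        self._by_key = {}  # hash table standing in for CALC-key access
        self._all = []     # storage order, used for sequential scans

    def store(self, record, owner=None):
        self._by_key[record.key] = record
        self._all.append(record)
        if owner is not None:
            owner.members.append(record)  # link the member into the owner's set
        return record

    def find_by_key(self, key):
        return self._by_key.get(key)

    def scan(self, predicate):
        return [r for r in self._all if predicate(r)]

store = NavigationalStore()
dept = store.store(Record("D1", "Engineering"))
store.store(Record("E1", "Ada"), owner=dept)
store.store(Record("E2", "Grace"), owner=dept)

# 1. Direct access by key (hashing).
print(store.find_by_key("E2").data)
# 2. Navigating a set from an owner record to its members.
print([m.data for m in store.find_by_key("D1").members])
# 3. Scanning all records in sequential order.
print([r.key for r in store.scan(lambda r: r.key.startswith("E"))])
```

Note that the application itself decides which path to follow, which is what made this style "navigational".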
Later systems added B-Trees to provide alternate
access paths. Many CODASYL databases also added a very straightforward query
language. However, in the final tally, CODASYL was very complex and required significant
training and effort to produce useful applications.
IBM also
had its own DBMS in 1968, known as IMS. IMS was
a development of software written for the Apollo program on the System/360. IMS was generally similar in
concept to CODASYL, but used a strict hierarchy for its model of data
navigation instead of CODASYL's network model. Both concepts later became known
as navigational databases due to the way data was accessed, and Bachman's 1973 Turing Award presentation was The
Programmer as Navigator. IMS is classified as a hierarchical database.
IDMS and Cincom Systems' TOTAL database are classified as network
databases. IMS remains in use as of 2014.[7]
1970s, relational DBMS
Edgar Codd worked
at IBM in San Jose, California,
in one of their offshoot offices that was primarily involved in the development
of hard disk systems. He was unhappy with the
navigational model of the CODASYL approach, notably the lack of a
"search" facility. In 1970, he wrote a number of papers that outlined
a new approach to database construction that eventually culminated in the
groundbreaking A Relational Model of Data for Large Shared Data Banks.[8]
In this paper, he described a new system for
storing and working with large databases. Instead of records being stored in
some sort of linked list of
free-form records as in CODASYL, Codd's idea was to use a "table" of fixed-length records, with each
table used for a different type of entity. A linked-list system would be very
inefficient when storing "sparse" databases where some of the data
for any one record could be left empty. The relational model solved this by
splitting the data into a series of normalized tables (or relations),
with optional elements being moved out of the main table to where they would
take up room only if needed. Data may be freely inserted, deleted and edited in
these tables, with the DBMS doing whatever maintenance is needed to present a
table view to the application/user.
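As a rough illustration of moving optional elements out of the main table, the following sketch uses Python's built-in sqlite3 module. The person and phone tables are hypothetical examples, not taken from Codd's paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Optional data lives in its own table and takes up room only when present.
    CREATE TABLE phone (person_id INTEGER REFERENCES person(id), number TEXT);
    INSERT INTO person VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO phone VALUES (1, '555-0100');
""")

# The query names the content it wants; the DBMS assembles the table view.
rows = conn.execute("""
    SELECT person.name, phone.number
    FROM person LEFT JOIN phone ON phone.person_id = person.id
    ORDER BY person.id
""").fetchall()
print(rows)
```

Grace has no phone number, so no row is stored for her in the phone table, yet she still appears in the combined view with an empty value.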
Integrated approach
In the 1970s and 1980s attempts were made to
build database systems with integrated hardware and software. The underlying
philosophy was that such integration would provide higher performance at lower
cost. Examples were IBM System/38, the early
offering of Teradata, and the Britton Lee, Inc. database machine.
Another approach to hardware support for
database management was ICL's CAFS accelerator,
a hardware disk controller with programmable search capabilities. In the long
term, these efforts were generally unsuccessful because specialized database
machines could not keep pace with the rapid development and progress of
general-purpose computers. Thus most database systems nowadays are software
systems running on general-purpose hardware, using general-purpose computer
data storage. However, this idea is still pursued for certain applications by
some companies, such as Netezza and Oracle (Exadata).
Late 1970s, SQL DBMS
IBM started working on a prototype system
loosely based on Codd's concepts as System R in the early
1970s. The first version was ready in 1974/5, and work then started on
multi-table systems in which the data could be split so that all of the data
for a record (some of which is optional) did not have to be stored in a single
large "chunk". Subsequent multi-user versions were tested by
customers in 1978 and 1979, by which time a standardized query language – SQL –
had been added. Codd's ideas were establishing themselves as both workable and
superior to CODASYL, pushing IBM to develop a true production version of System
R, known as SQL/DS, and, later, Database 2 (DB2).
Larry Ellison's Oracle started from a
different chain, based on IBM's papers on System R, and beat IBM to market when
the first version was released in 1978.
Stonebraker went on to apply the lessons from
INGRES to develop a new database, Postgres, which is now known as PostgreSQL.
PostgreSQL is often used for global mission critical applications (the .org and
.info domain name registries use it as their primary data store, as do many
large companies and financial institutions).
In Sweden, Codd's paper was also read and Mimer SQL was developed from the
mid-1970s at Uppsala University.
In 1984, this project was consolidated into an independent enterprise. In the
early 1980s, Mimer introduced transaction handling for high robustness in
applications, an idea that was subsequently implemented on most other DBMSs.
Another data model, the entity-relationship
model, emerged in 1976 and gained popularity for database design as it emphasized a more
familiar description than the earlier relational model. Later on,
entity-relationship constructs were retrofitted as a data modeling construct
for the relational model, and the difference between the two has become
irrelevant.
1980s, on the desktop
The 1980s ushered in the age of desktop computing. The new computers empowered
their users with spreadsheets like Lotus 1-2-3 and database software like
dBASE. The dBASE product was lightweight and easy for any computer user to
understand out of the box. C. Wayne Ratliff, the creator of dBASE,
stated: “dBASE was different from programs like BASIC, C, FORTRAN, and COBOL in
that a lot of the dirty work had already been done. The data manipulation is
done by dBASE instead of by the user, so the user can concentrate on what he is
doing, rather than having to mess with the dirty details of opening, reading,
and closing files, and managing space allocation.” [14] dBASE was one of the top
selling software titles in the 1980s and early 1990s.
1980s, object-oriented
The 1980s, along with a rise in object-oriented
programming, saw a growth in how data in various databases were
handled. Programmers and designers began to treat the data in their databases
as objects. That is to say that if a person's data were in a database, that
person's attributes, such as their address, phone number, and age, were now
considered to belong to that person instead of being extraneous data. This
allows for relations between data to be relations to objects and their attributes
and not to individual fields.[15] The term
"object-relational impedance mismatch" described the inconvenience of
translating between programmed objects and database tables. Object databases
and object-relational databases attempt to solve this problem by providing an
object-oriented language (sometimes as extensions to SQL) that programmers can
use as an alternative to purely relational SQL. On the programming side, libraries
known as object-relational
mappings (ORMs) attempt to solve the same problem.
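The translation work that the impedance mismatch refers to can be sketched by hand. The example below assumes Python's sqlite3 and dataclasses modules; the Person class and the save and load_all helpers are hypothetical names for illustration, and a real ORM library automates far more than this.

```python
import sqlite3
from dataclasses import dataclass, astuple

@dataclass
class Person:
    name: str
    address: str
    age: int

def save(conn, person):
    # Object -> row: flatten the object's attributes into table columns.
    conn.execute(
        "INSERT INTO person (name, address, age) VALUES (?, ?, ?)",
        astuple(person),
    )

def load_all(conn):
    # Row -> object: rebuild objects from the returned tuples.
    cur = conn.execute("SELECT name, address, age FROM person ORDER BY name")
    return [Person(*row) for row in cur]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, address TEXT, age INTEGER)")
save(conn, Person("Ada", "12 Main St", 36))
people = load_all(conn)
print(people)
```

Every class needs a pair of conversions like these, which is the repetitive work that object databases and ORMs aim to remove.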
2000s, NoSQL and NewSQL
The next generation of post-relational databases
in the 2000s became known as NoSQL databases, including fast key-value stores
and document-oriented databases. XML databases are a type of structured
document-oriented database that allows querying based on XML document
attributes.
NoSQL databases are often very fast, do not
require fixed table schemas, avoid join operations by storing denormalized data, and are designed to scale horizontally.
In recent years there has been high demand for
massively distributed databases with high partition tolerance, but according to
the CAP theorem it is impossible for a distributed system to
simultaneously provide consistency, availability and partition
tolerance guarantees. A distributed system can satisfy any two
of these guarantees at the same time, but not all three. For that reason many
NoSQL databases use what is called eventual consistency to
provide both availability and partition tolerance guarantees with a maximum
level of data consistency.
The most popular NoSQL systems include MongoDB, Couchbase, Riak, memcached, Redis, CouchDB, Hazelcast, Apache Cassandra and HBase.[16] Note that all are open-source software products.
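Eventual consistency, as mentioned above, can be pictured with a toy model. The sketch below assumes a simple "last write wins" rule based on timestamps; this rule and the Replica class are assumptions made for illustration and do not describe how any of the listed products resolve conflicts.

```python
class Replica:
    """One node holding key -> (timestamp, value) pairs."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Keep the incoming value only if it is newer than what we hold.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Anti-entropy step: pull every entry the other replica knows about.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)

a, b = Replica(), Replica()
a.write("cart", ["book"], timestamp=1)         # accepted by replica a
b.write("cart", ["book", "pen"], timestamp=2)  # later write lands on replica b
stale = a.read("cart")                         # a still answers, with older data
a.sync_from(b)
b.sync_from(a)
print(stale, a.read("cart"), b.read("cart"))
```

Both replicas stay available for reads and writes while they disagree, and they converge on the same value once they have exchanged updates.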
A number of new relational databases continuing
use of SQL but aiming for performance comparable to NoSQL are known as NewSQL.
Examples
One way to classify databases involves the type
of their contents, for example: bibliographic, document-text, statistical, or
multimedia objects. Another way is by their application area, for example:
accounting, music compositions, movies, banking, manufacturing, or insurance. A
third way is by some technical aspect, such as the database structure or
interface type. This section lists a few of the adjectives used to characterize
different kinds of databases.
·
An in-memory database is a database that
primarily resides in main memory, but is typically backed up by non-volatile
computer data storage. Main memory databases are faster than disk databases,
and so are often used where response time is critical, such as in
telecommunications network equipment.[17] The SAP HANA platform
is a prominent example of an in-memory database. By May 2012, HANA was able to run
on servers with 100 TB of main memory powered by IBM. The co-founder of the company
claimed that the system was big enough to run the 8 largest SAP customers.
·
An active database includes
an event-driven architecture which can respond to conditions both inside and
outside the database. Possible uses include security monitoring, alerting,
statistics gathering and authorization. Many databases provide active database
features in the form of database
triggers.
·
A cloud database relies
on cloud technology. Both the database and most of
its DBMS reside remotely, "in the cloud", while its applications are
both developed by programmers and later maintained and utilized by
the application's end-users through a web browser and open APIs.
·
Data warehouses archive
data from operational databases and often from external sources such as market
research firms. The warehouse becomes the central source of data for use by
managers and other end-users who may not have access to operational data. For
example, sales data might be aggregated to weekly totals and converted from
internal product codes to use UPCs so that they can be compared with ACNielsen data.
Some basic and essential components of data warehousing include retrieving,
analyzing, and mining data, as well as transforming, loading and managing data so
as to make them available for further use.
·
A deductive database combines logic programming with
a relational database, for example by using the Datalog language.
·
A distributed database is one in which both
the data and the DBMS span multiple computers.
·
A document-oriented
database is designed for storing, retrieving, and managing document-oriented,
or semi-structured, information. Document-oriented databases are one of
the main categories of NoSQL databases.
·
An embedded
database system is a DBMS which is tightly integrated with
application software that requires access to stored data in such a way that the
DBMS is hidden from the application’s end-users and requires little or no
ongoing maintenance.[18]
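Two of the entries in the list above, the in-memory database and the active database, can be combined in one small sketch. It assumes Python's built-in sqlite3 module, where the special name ":memory:" creates a database held in main memory; the account and alert tables and the low_balance trigger are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the whole database lives in main memory
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE alert (account_id INTEGER, message TEXT);
    -- Active-database feature: the trigger reacts to a condition inside the database.
    CREATE TRIGGER low_balance AFTER UPDATE OF balance ON account
    WHEN NEW.balance < 0
    BEGIN
        INSERT INTO alert VALUES (NEW.id, 'overdrawn');
    END;
    INSERT INTO account VALUES (1, 50);
""")

conn.execute("UPDATE account SET balance = balance - 80 WHERE id = 1")
alerts = conn.execute("SELECT account_id, message FROM alert").fetchall()
print(alerts)
```

The application only issues the update; the alert row is written by the DBMS itself when the trigger's condition is met.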
The first task of a database designer is to
produce a conceptual data model that reflects the
structure of the information to be held in the database. A common approach to
this is to develop an entity-relationship model, often with the aid of drawing
tools. Another popular approach is the Unified Modeling Language. A successful
data model will accurately reflect the possible state of the external world
being modeled: for example, if people can have more than one phone number, it
will allow this information to be captured. Designing a good conceptual data
model requires a good understanding of the application domain; it typically
involves asking deep questions about the things of interest to an organisation,
like "can a customer also be a supplier?", or "if a product is
sold with two different forms of packaging, are those the same product or
different products?", or "if a plane flies from New York to Dubai via
Frankfurt, is that one flight or two (or maybe even three)?". The answers
to these questions establish definitions of the terminology used for entities
(customers, products, flights, flight segments) and their relationships and
attributes.
Producing the conceptual data model sometimes involves input from business processes, or the analysis of workflow in
the organization. This can help to establish what information is needed in the
database, and what can be left out. For example, it can help when deciding whether
the database needs to hold historic data as well as current data.
Having produced a conceptual data model that users are happy with, the next stage is
to translate this into a schema that
implements the relevant data structures within the database. This process is
often called logical database design, and the output is a logical data model expressed in the form
of a schema. Whereas the conceptual data model is (in theory at least)
independent of the choice of database technology, the logical data model will
be expressed in terms of a particular database model supported by the chosen
DBMS. (The terms data model and database model are
often used interchangeably, but in this article we use data model for
the design of a specific database, and database model for the
modelling notation used to express that design.)
The most popular database model
for general-purpose databases is the relational model, or more precisely, the
relational model as represented by the SQL language. The process of creating a
logical database design using this model uses a methodical approach known as normalization.
The goal of normalization is to
ensure that each elementary "fact" is only recorded in one place, so
that insertions, updates, and deletions automatically maintain consistency.
The final stage of database design is to make the decisions that affect
performance, scalability, recovery, security, and the like. This is often
called physical database design. A key goal during this stage is data
independence, meaning that the decisions made for performance
optimization purposes should be invisible to end-users and applications.
Physical design is driven mainly by performance requirements, and requires a
good knowledge of the expected workload and access patterns, and a deep
understanding of the features offered by the chosen DBMS. Another aspect of
physical database design is security. It involves both defining access control to
database objects and defining security levels and methods for the data
itself.
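The goal of normalization described above, recording each elementary fact in only one place, can be shown with a minimal sketch. It assumes Python's built-in sqlite3 module and hypothetical customer and purchase tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item TEXT
    );
    INSERT INTO customer VALUES (1, 'Oslo');
    INSERT INTO purchase VALUES (10, 1, 'lamp'), (11, 1, 'desk');
""")

# The customer's city is one fact stored in one row, so a single UPDATE suffices.
conn.execute("UPDATE customer SET city = 'Bergen' WHERE id = 1")

cities = conn.execute("""
    SELECT purchase.item, customer.city
    FROM purchase JOIN customer ON customer.id = purchase.customer_id
    ORDER BY purchase.id
""").fetchall()
print(cities)
```

Had the city been copied into every purchase row instead, the same change would need one update per copy, and a missed copy would leave the data inconsistent.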
Replication
Occasionally a database employs storage redundancy by
replicating database objects (with one or more copies) to increase data
availability (both to improve performance of simultaneous multiple end-user
accesses to the same database object, and to provide resiliency in the case of
partial failure of a distributed database). Updates of a replicated object need
to be synchronized across the object copies. In many cases the entire database
is replicated.
Database security concerns the use of a broad range of
information security controls to protect databases (potentially including the
data, the database applications or stored functions, the database systems, the
database servers and the associated network links) against compromises of their
confidentiality, integrity and availability. It involves various types or
categories of controls, such as technical, procedural/administrative and
physical. Database security is a specialist topic within the
broader realms of computer security, information security and risk management.
Security risks to database systems include, for
example:
·
Unauthorized or
unintended activity or misuse by authorized database users, database
administrators, or network/systems managers, or by unauthorized users or
hackers (e.g. inappropriate access to sensitive data, metadata or functions
within databases, or inappropriate changes to the database programs, structures
or security configurations);
·
Malware infections
causing incidents such as unauthorized access, leakage or disclosure of
personal or proprietary data, deletion of or damage to the data or programs,
interruption or denial of authorized access to the database, attacks on other
systems and the unanticipated failure of database services;
·
Overloads, performance
constraints and capacity issues resulting in the inability of authorized users
to use databases as intended;
·
Physical damage to
database servers caused by computer room fires or floods, overheating,
lightning, accidental liquid spills, static discharge, electronic
breakdowns/equipment failures and obsolescence;
·
Design flaws and
programming bugs in databases and the associated programs and systems, creating
various security vulnerabilities (e.g. unauthorized privilege escalation), data loss/corruption,
performance degradation etc.;
·
Data corruption and/or
loss caused by the entry of invalid data or commands, mistakes in database or
system administration processes, sabotage/criminal damage etc.
Many layers and types of information security control are appropriate
to databases, including:
·
Access control
·
Auditing
·
Authentication
·
Encryption
·
Integrity controls
·
Backups
·
Application security
·
Database security applying statistical methods
Traditionally databases have been largely
secured against hackers through network security measures
such as firewalls, and network-based intrusion detection systems. While network
security controls remain valuable in this regard, securing the database systems
themselves, and the programs/functions and data within them, has arguably become
more critical as networks are increasingly opened to wider access, in
particular access from the Internet. Furthermore, system, program, function and
data access controls, along with the associated user identification,
authentication and rights management functions, have always been important to
limit and in some cases log the activities of authorized users and
administrators. In other words, these are complementary approaches to database
security, working from both the outside-in and the inside-out as it were.
Many organizations develop their own
"baseline" security standards and designs detailing basic security
control measures for their database systems. These may reflect general
information security requirements or obligations imposed by corporate information
security policies and applicable laws and regulations (e.g. concerning privacy,
financial management and reporting systems), along with generally-accepted good
database security practices (such as appropriate hardening of the underlying
systems) and perhaps security recommendations from the relevant database system
and software vendors. The security designs for specific database systems
typically specify further security administration and management functions
(such as administration and reporting of user access rights, log management and
analysis, database replication/synchronization and backups) along with various
business-driven information security controls within the database programs and
functions (e.g. data entry validation and audit trails).
Furthermore, various security-related activities (manual controls) are normally
incorporated into the procedures, guidelines etc. relating to the design,
development, configuration, use, management and maintenance of databases.
There are four main types of database
management systems (DBMSs), distinguished by their
management of database structures. In other words, the type of a DBMS depends entirely upon how the
database is structured by that particular DBMS.
Hierarchical DBMS
A DBMS is said to be hierarchical if the
relationships among data in the database are established in such a way that one
data item is present as the subordinate, or sub-unit, of another. Here
subordinate means that items have "parent-child" relationships among
them. Direct relationships exist between any two records that are stored
consecutively. The DBMS follows the "tree" data structure to
structure the database. No backward movement is possible in a hierarchical database.
The hierarchical data model was developed by IBM
in 1968 and introduced in its Information Management System (IMS). This model is like the
structure of a tree, with the records forming the nodes and fields forming the
branches of the tree. In the hierarchical model, records are linked in the form
of an organization chart. A tree structure can establish one-to-many
relationships.
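The parent-child structure described above can be sketched as a small tree. This is an illustration only, with a hypothetical Node class; it is not how IMS stores or addresses records. Each node keeps links to its children but none back to its parent, mirroring the top-down navigation of the model.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []  # one parent can own many children

    def add(self, name):
        child = Node(name)
        self.children.append(child)
        return child

def paths(node, prefix=()):
    """Walk the tree top-down, yielding the path to every record."""
    here = prefix + (node.name,)
    yield "/".join(here)
    for child in node.children:
        yield from paths(child, here)

root = Node("company")
sales = root.add("sales")
sales.add("alice")
root.add("engineering").add("bob")
print(list(paths(root)))
```

Every record is reached by starting at the root and descending, which is why a one-to-many relationship fits the model naturally while a many-to-many relationship does not.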
Network DBMS
A DBMS is said to be a network DBMS if the
relationships among data in the database are of the many-to-many type. These
many-to-many relationships appear in the form of a network. The structure of a network database is therefore extremely complicated,
because in these many-to-many relationships one record can be used as
a key of the entire database. A network database is structured in the form of a
graph, which is also a data structure. Although the structure of such a DBMS is highly
complicated, it has just two basic elements, records and sets, to
designate many-to-many relationships. High-level languages such as Pascal, C++, COBOL and FORTRAN were mainly used to implement the
record and set structures.
Relational DBMS
A DBMS is said to be a relational DBMS or RDBMS
if the database relationships are treated in the form of a table. There are
three key concepts in a relational DBMS: relations, domains and attributes. A network
database, by contrast, contains two fundamental constructs, sets and records: sets contain one-to-many
relationships and records contain fields. In a relational DBMS, a
table that is composed of rows and columns is used to organize
the database and its structure, and can be pictured as a two-dimensional array in computer memory. A number of RDBMSs are
available; some popular examples are Oracle, Sybase, Ingres, Informix, Microsoft SQL Server,
and Microsoft Access.
Object-oriented databases
Able to handle many new data types, including
graphics, photographs, audio, and video, object-oriented databases represent a
significant advance over their other database cousins. Hierarchical and network
databases are all designed to handle structured data; that is, data that fits
nicely into fields, rows, and columns. They are useful for handling small
snippets of information such as names, addresses, zip codes, product numbers,
and any kind of statistic or number you can think of. On the other hand, an
object-oriented database can be used to store data from a variety of media
sources, such as photographs and text, and produce work, as output, in a
multimedia format.[1]
·
Object-oriented
databases use small, reusable chunks of software called objects. The objects
themselves are stored in the object-oriented database. Each object consists of
two elements: 1) a piece of data (e.g., sound, video, text, or graphics), and
2) the instructions, or software programs called methods, for what to do with
the data. Part two of this definition requires a little more explanation. The
instructions contained within the object are used to do something with the data
in the object. For example, test scores would be within the object as would the
instructions for calculating average test score.
·
Object-oriented
databases have two disadvantages. First, they are more costly to develop.
Second, most organizations are reluctant to abandon or convert from those
databases that they have already invested money in developing and implementing.
However, the benefits to object-oriented databases are compelling. The ability
to mix and match reusable objects provides incredible multimedia capability.
Healthcare organizations, for example, can store, track, and recall CAT scans,
X-rays, electrocardiograms and many other forms of crucial data.
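The idea of an object that carries both its data and the instructions for working with that data, as in the test-score example above, can be sketched as follows. The ScoreSheet class is hypothetical, and a plain dictionary stands in for the object database; a real object database would also persist the objects.

```python
class ScoreSheet:
    def __init__(self, scores):
        self.scores = list(scores)  # the data held by the object

    def average(self):
        # A method: the instructions for what to do with the data.
        return sum(self.scores) / len(self.scores)

# Hypothetical stand-in for an object database: objects are stored whole.
store = {"class-a": ScoreSheet([70, 80, 90])}

sheet = store["class-a"]
print(sheet.average())
```

The caller retrieves the object and asks it for its average; the calculation travels with the data rather than living in a separate application program.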
A DBMS has a number of advantages compared to the traditional
computer file processing approach. The DBA must keep these benefits or
capabilities in mind when designing databases and when coordinating and monitoring the DBMS.
The major advantages of a DBMS are described
below.
1. Controlling Data Redundancy:
In non-database systems (traditional computer file processing),
each application program has its own files. In this case, duplicated copies
of the same data are created in many places. In a DBMS, all the data of an
organization is integrated into a single database. The data is recorded in only
one place in the database and is not duplicated. For example, the dean's
faculty file and the faculty payroll file contain several items that are
identical. When they are converted into a database, the data is integrated into a
single database so that multiple copies of the same data are reduced to a single
copy.
In a DBMS, data redundancy can be controlled or reduced but is
not removed completely. Sometimes it is necessary to create duplicate copies
of the same data items in order to relate tables with each other.
By controlling data redundancy, you can save storage space.
Similarly, it is useful for retrieving data from the database using queries.
2. Data Consistency:
Controlling data redundancy also yields data consistency. If a data
item appears only once, any update to its value has to be performed only once,
and the updated value (the new value of the item) is immediately available to
all users.
Where redundancy has been reduced to a minimum but not eliminated,
the database system enforces consistency: when a data item that appears more
than once in the database is updated, the DBMS can automatically update each
occurrence of that item.
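The two points above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the table names, column names and sample values (faculty, payroll, 'A. Khan') are assumptions invented for the example. Each faculty member's department is stored once, only the key column is repeated to relate the tables, and a single update is seen by both the dean's report and the payroll report.

```python
import sqlite3

# Hypothetical schema: faculty details are stored once; payroll repeats
# only the key column (faculty_id) in order to relate the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (
        faculty_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL
    );
    CREATE TABLE payroll (
        faculty_id INTEGER PRIMARY KEY REFERENCES faculty(faculty_id),
        salary     INTEGER NOT NULL
    );
    INSERT INTO faculty VALUES (1, 'A. Khan', 'Physics');
    INSERT INTO payroll VALUES (1, 50000);
""")


def dean_report(conn):
    """What the dean's application reads."""
    return conn.execute("SELECT name, department FROM faculty").fetchall()


def payroll_report(conn):
    """What the payroll application reads; the department comes from faculty."""
    return conn.execute(
        "SELECT f.name, f.department, p.salary "
        "FROM payroll AS p JOIN faculty AS f USING (faculty_id)"
    ).fetchall()


# The department is changed in exactly one place.
conn.execute("UPDATE faculty SET department = 'Mathematics' WHERE faculty_id = 1")
conn.commit()

print(dean_report(conn))
print(payroll_report(conn))
```

Because neither report keeps its own copy of the department, there is no second value that could fall out of date.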
3. Data Sharing:
In a DBMS, data can be shared by the authorized users of the
organization. The DBA manages the data and grants users the rights to access
it. Many users can be authorized to access the same set of information
simultaneously. Remote users can also share the same data. Similarly, the data
of one database can be shared between different application programs.
4. Data Integration:
In a DBMS, data is stored in tables. A single database contains
multiple tables, and relationships can be created between tables (or
associated data entities). This makes it easy to retrieve and update data.
5. Integrity Constraints:
Integrity constraints, or consistency rules, can be applied to a
database so that only correct data can be entered into it. The constraints
may apply to data items within a single record or to relationships between
records.
Examples of integrity constraints are:
(i) 'Issue Date' in a library system cannot be later than the
corresponding 'Return Date' of a book.
(ii) The marks obtained in a subject cannot exceed 100.
(iii) Registration numbers of BCS and MCS students must start with
'BCS' and 'MCS' respectively.
There are also some standard constraints that are built into
most DBMSs. These are:
Constraint Name | Description
PRIMARY KEY     | Designates a column or combination of columns as the primary key; therefore, its values cannot be repeated or left blank.
FOREIGN KEY     | Relates one table with another table.
UNIQUE          | Specifies that values of a column or combination of columns cannot be repeated.
NOT NULL        | Specifies that a column cannot contain empty values.
CHECK           | Specifies a condition which each row of a table must satisfy.
Most DBMSs provide facilities for applying integrity constraints.
The database designer (or DBA) identifies integrity constraints during
database design. The application programmer can also enforce integrity
constraints in program code while developing the application program.
Integrity constraints are checked automatically at the time of data entry or
when a record is updated. If the data entry operator (end user) violates an
integrity constraint, the data is not inserted or updated in the database and
the system displays a message. For example, when you withdraw money from a
bank with an ATM card, your account balance is compared with the amount you
are withdrawing. If your balance is less than the amount you want to withdraw,
a message is displayed on the screen to inform you about your account
balance.
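As a rough illustration of how such constraints are declared and checked, the sketch below uses Python's built-in sqlite3 module. The schema (student, result, loan), the e-mail column and the sample values are assumptions made up for this example; the rules mirror the examples given earlier (registration number prefix, marks not above 100, issue date not later than return date). Note that SQLite enforces FOREIGN KEY only when the pragma shown is enabled; other DBMSs differ in such details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FOREIGN KEY only when enabled
conn.executescript("""
    CREATE TABLE student (
        reg_no TEXT NOT NULL PRIMARY KEY
               CHECK (substr(reg_no, 1, 3) IN ('BCS', 'MCS')),
        email  TEXT NOT NULL UNIQUE
    );
    CREATE TABLE result (
        reg_no  TEXT NOT NULL REFERENCES student(reg_no),
        subject TEXT NOT NULL,
        marks   INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100),
        PRIMARY KEY (reg_no, subject)
    );
    CREATE TABLE loan (
        loan_id     INTEGER PRIMARY KEY,
        issue_date  TEXT NOT NULL,   -- ISO dates (YYYY-MM-DD) compare correctly as text
        return_date TEXT NOT NULL,
        CHECK (issue_date <= return_date)
    );
""")


def try_insert(sql, params):
    """Run one INSERT; return 'ok' or the rejection message from the DBMS."""
    try:
        with conn:  # commits on success, rolls back if the statement fails
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


attempts = [
    ("INSERT INTO student VALUES (?, ?)", ("BCS-001", "a@example.org")),   # valid
    ("INSERT INTO student VALUES (?, ?)", ("XYZ-002", "b@example.org")),   # bad prefix
    ("INSERT INTO student VALUES (?, ?)", ("MCS-003", "a@example.org")),   # duplicate e-mail
    ("INSERT INTO result VALUES (?, ?, ?)", ("BCS-001", "Maths", 101)),    # marks above 100
    ("INSERT INTO result VALUES (?, ?, ?)", ("BCS-999", "Maths", 80)),     # unknown student
    ("INSERT INTO loan VALUES (?, ?, ?)", (1, "2014-03-20", "2014-03-10")),  # issued after return
]
outcomes = [try_insert(sql, params) for sql, params in attempts]
for outcome in outcomes:
    print(outcome)
```

Only the first row is stored; each of the others is refused by the DBMS itself, regardless of which application attempted the insert.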
6. Data Security:
Data security is the protection of the database from unauthorized users.
Only authorized persons are allowed to access the database. Some users may be
allowed to access only a part of the database, i.e., the data that is related
to them or to their department. Usually, the DBA or the head of a department
can access all the data in the database. Some users may be permitted only to
retrieve data, whereas others are allowed to retrieve as well as update data.
Database access is controlled by the DBA, who creates user accounts and grants
rights to access the database. Typically, users or groups of users are given
usernames protected by passwords.
Most DBMSs provide a security subsystem, which the DBA uses to
create user accounts and to specify account restrictions. The user enters an
account number (or username) and password to access data in the database. For
example, if you have an e-mail account at "hotmail.com" (a popular website),
you have to give your correct username and password to access that account.
Similarly, when you insert your ATM card into an automated teller machine
(ATM) at a bank, the machine reads the ID number on the card and then asks you
to enter your PIN code (or password). In this way, you can access your
account.
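The idea of accounts, passwords and per-user rights can be sketched in a few lines of Python. This is a toy model, not how any particular DBMS implements its security subsystem: the AccessControl class, its method names, the usernames and the table names are all hypothetical. Passwords are stored only as salted hashes, using the standard library's hashlib.pbkdf2_hmac.

```python
import hashlib
import hmac
import os


class AccessControl:
    """Toy security subsystem: user accounts, password checks, per-table rights."""

    def __init__(self):
        self._accounts = {}  # username -> (salt, password_hash, rights)

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def create_account(self, username, password, rights):
        """rights maps a table name to the set of actions allowed on it."""
        salt = os.urandom(16)
        self._accounts[username] = (salt, self._hash(password, salt), rights)

    def is_allowed(self, username, password, table, action):
        """True only if the password matches and the action was granted."""
        account = self._accounts.get(username)
        if account is None:
            return False
        salt, stored_hash, rights = account
        if not hmac.compare_digest(stored_hash, self._hash(password, salt)):
            return False
        return action in rights.get(table, set())


# The DBA role here is simply whoever calls create_account.
acl = AccessControl()
acl.create_account("clerk", "s3cret", {"sales": {"read"}})
acl.create_account("dba", "adm1n", {"sales": {"read", "update"},
                                    "payroll": {"read", "update"}})

print(acl.is_allowed("clerk", "s3cret", "sales", "read"))
print(acl.is_allowed("clerk", "s3cret", "sales", "update"))
print(acl.is_allowed("dba", "adm1n", "payroll", "update"))
```

The clerk account can only retrieve sales data, while the dba account can retrieve and update every table, matching the distinction drawn above between read-only and read-write users.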
7. Data Atomicity:
A transaction in a commercial database is referred to as an atomic
unit of work. For example, when you purchase something at a point of sale
(POS) terminal, a number of tasks are performed, such as:
· Company stock is updated.
· The amount is added to the company's account.
· The salesperson's commission increases, etc.
All these tasks together are called an atomic unit of work, or
transaction. Either all of these tasks are completed, or the partially
completed tasks are rolled back. In this way, the DBMS ensures that only
consistent data exists within the database.
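The point-of-sale example can be sketched as a single transaction using Python's built-in sqlite3 module. The table names, the sample item and salesperson, and the 10% commission rate are assumptions made for illustration only. The stock update is deliberately placed last so that, when it fails, the two earlier updates have to be rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (
        item     TEXT PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity >= 0)
    );
    CREATE TABLE company_account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE commission (salesperson TEXT PRIMARY KEY, amount INTEGER NOT NULL);
    INSERT INTO stock VALUES ('pen', 5);
    INSERT INTO company_account VALUES (1, 0);
    INSERT INTO commission VALUES ('sara', 0);
""")


def record_sale(conn, item, quantity, price, salesperson):
    """Apply all three updates as one transaction; return True if committed."""
    total = price * quantity
    try:
        with conn:  # commit if the whole block succeeds, roll back if it raises
            conn.execute(
                "UPDATE company_account SET balance = balance + ? WHERE id = 1",
                (total,))
            conn.execute(
                "UPDATE commission SET amount = amount + ? WHERE salesperson = ?",
                (total // 10, salesperson))  # assumed 10% commission
            # Fails the CHECK constraint if there is not enough stock.
            conn.execute(
                "UPDATE stock SET quantity = quantity - ? WHERE item = ?",
                (quantity, item))
        return True
    except sqlite3.IntegrityError:
        return False


def snapshot(conn):
    """Return (stock of pens, company balance, sara's commission)."""
    stock = conn.execute("SELECT quantity FROM stock WHERE item = 'pen'").fetchone()[0]
    balance = conn.execute("SELECT balance FROM company_account WHERE id = 1").fetchone()[0]
    earned = conn.execute("SELECT amount FROM commission WHERE salesperson = 'sara'").fetchone()[0]
    return stock, balance, earned


first = record_sale(conn, "pen", 2, 10, "sara")    # enough stock
after_first = snapshot(conn)
second = record_sale(conn, "pen", 10, 10, "sara")  # more than is in stock
after_second = snapshot(conn)
print(first, after_first)
print(second, after_second)
```

The second sale changes nothing: although the account and commission updates ran before the stock update failed, the rollback removes them, so the database never shows money received for goods that were not in stock.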
Although a DBMS has many advantages, it may also have some
minor disadvantages. These are:
1. Cost of Hardware & Software:
A processor with a high data processing speed and a large memory
are required to run the DBMS software. This means that you may have to upgrade
the hardware used for the file-based system. Similarly, DBMS software is also
very costly.
2. Cost of Data Conversion:
When a computer file-based system is replaced with a database
system, the data stored in data files must be converted to database files.
Converting the data of data files into a database is a difficult and
time-consuming process. You have to hire a DBA (or database designer) and a
system designer along with application programmers; alternatively, you have to
use the services of a software house. So a lot of money has to be paid for
developing the database and related software.
3. Cost of Staff Training:
Most DBMSs are complex systems, so users need training to use
them. Training is required at all levels, including programming, application
development, and database administration. The organization has to spend a
large amount on training staff to run the DBMS.
4. Appointing Technical Staff:
Trained technical persons, such as database administrators and
application programmers, are required to handle the DBMS. You have to pay
handsome salaries to these persons, so the system cost increases.
5. Database Failures:
In most organizations, all data is integrated into a single
database. If the database is corrupted due to a power failure or damage to
the storage media, valuable data may be lost or the whole system may stop.
Distributed database
A distributed database is a database in which the storage devices are not
all attached to a common processing unit such as the CPU,[1] and which is
controlled by a distributed database management system (together sometimes
called a distributed database system). It may be stored on multiple computers
located in the same physical location, or it may be dispersed over a network
of interconnected computers. Unlike parallel systems, in which the processors
are tightly coupled and constitute a single database system, a distributed
database system consists of loosely coupled sites that share no physical
components. System administrators can distribute collections of data (e.g. in
a database) across multiple physical locations. A distributed database can
reside on network servers on the Internet, on corporate intranets or
extranets, or on other company networks. Because they store data across
multiple computers, distributed databases can improve performance at end-user
worksites by allowing transactions to be processed on many machines, instead
of being limited to one.[2]
Two processes ensure that distributed databases remain up to date:
replication and duplication.
1. Replication involves using specialized software that looks for
changes in the distributed database. Once the changes have been identified,
the replication process makes all the databases look the same. The
replication process can be complex and time-consuming, depending on the size
and number of the distributed databases, and it can require a lot of computer
resources.
2. Duplication, on the other hand, is less complex. It identifies
one database as a master and then duplicates that database. The duplication
process is normally done at a set time after hours, to ensure that each
distributed location has the same data. In the duplication process, users may
change only the master database. This ensures that local data will not be
overwritten.
Both replication and duplication can keep the data current in all
distributed locations.
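The difference between the two processes can be sketched with plain Python dictionaries standing in for databases. Everything here is a simplified illustration: the Site class and the duplicate and replicate functions are hypothetical names, and conflicting changes are resolved by the crude assumption that the change appearing later in the collected list wins. Real replication software has to handle conflicts, failures and ordering far more carefully.

```python
import copy


class Site:
    """One location holding a copy of the data plus a log of local changes."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.changes = []  # (key, value) pairs not yet propagated

    def write(self, key, value):
        self.data[key] = value
        self.changes.append((key, value))


def duplicate(master, copies):
    """Duplication: overwrite every copy with a full snapshot of the master."""
    for site in copies:
        site.data = copy.deepcopy(master.data)
    master.changes.clear()


def replicate(sites):
    """Replication: find the changes made at every site and apply them everywhere.

    Assumption: changes are applied in the same order at every site, so a
    conflicting change that comes later in the list simply wins.
    """
    pending = [change for site in sites for change in site.changes]
    for site in sites:
        for key, value in pending:
            site.data[key] = value
        site.changes.clear()


# Duplication: only the master is changed; the branch receives a snapshot.
master, branch = Site("master"), Site("branch")
master.write("room_101", "vacant")
duplicate(master, [branch])

# Replication: both sites accept changes, then are brought into agreement.
east, west = Site("east"), Site("west")
east.write("room_201", "occupied")
west.write("room_202", "vacant")
replicate([east, west])

print(branch.data)
print(east.data == west.data)
```

Duplication is the simpler function because it never has to ask what changed; replication must collect changes from every site, which is where the extra complexity and resource cost described above comes from.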
Besides distributed database replication and fragmentation, there are
many other distributed database design technologies, for example local
autonomy and synchronous and asynchronous distributed database technologies.
How these technologies are implemented depends on the needs of the business
and the sensitivity/confidentiality of the data stored in the database, and
hence on the price the business is willing to pay to ensure data security,
consistency and integrity.
When discussing access to distributed databases, Microsoft favors
the term distributed query, which it defines in a protocol-specific
manner as "[a]ny SELECT, INSERT, UPDATE, or DELETE statement that
references tables and rowsets from one or more external OLE DB data sources". Oracle provides
a more language-centric view in which distributed queries and distributed transactions form part of distributed
The above data was collected and shared from various sources
available on websites and in general articles such as newspapers and
magazines. The data might not be 100% correct; all users are requested to
re-verify it. Web world group India
Data taken on 20/03/2014








