Category: Data Warehousing and Data Mining

Introduction to Data Preprocessing and its Types

What is data preprocessing? Data preprocessing is a data mining technique that transforms incomplete, inconsistent, and/or noisy data, which increases the chance of error and misinterpretation, into an understandable format. Incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data, e.g., occupation = “”. Noisy: noisy data is meaningless data...

Why do we need a Data Mart? Types of Data Mart

What is a data mart? A data mart is a subset of a data warehouse oriented to a particular business line or subject. It contains a repository of summarized data collected for analysis by a specific section or unit within an organization, and it is controlled by a single department in that organization. A data mart usually draws data from only a few sources compared...

Challenges of Data Mining

In several sectors, data mining and knowledge discovery are becoming critical technologies for businesses and researchers. While data mining is becoming a well-established and reputable subject, there are still numerous difficulties to overcome. Some of the challenges are: Mining methodology and user interaction issues. This refers to the following kinds of issues: Mining different...

Study about Data Object and Attribute Types

Data sets are made up of data objects. A data object represents an entity. Examples: sales database: customers, store items, sales; medical database: patients, treatments; university database: students, professors, courses. Data objects are also called samples, examples, instances, data points, objects, or tuples. Data objects are described by attributes. Database rows -> data objects;...

Knowledge Discovery in Databases (KDD)

Knowledge Discovery in Databases (KDD): Knowledge discovery in databases is the process of discovering useful knowledge from a collection of data. This widely used data mining technique is a process that includes data preparation and selection, data cleansing, incorporating prior knowledge about the data sets, and interpreting accurate solutions from the observed results. Knowledge discovery...

Data Mining functionalities

Data mining functions are used to define the kinds of patterns that will be discovered during data mining jobs. Some of the major data mining functionalities are as follows: Class/concept descriptions: Characterization and Discrimination. Class/concept descriptions are the definitions of a class or idea. Data features should be generalised, summarised, and contrasted. For example,...

Introduction To Data Mining

Motivation for Data Mining Over the last three decades, the steady and remarkable advancement of computer hardware technology has resulted in a large supply of powerful and affordable computers, data collection equipment, and storage media. This technology provides a significant boost to the database and information industries, allowing for the availability of a large number...

What are the Components of a Data Warehouse and its need?

Components of Data Warehouse A typical data warehouse consists of four major components: a central database, ETL (extract, transform, and load) tools, metadata, and access tools. All of these components are designed to work rapidly, allowing you to acquire findings and analyze data on the fly. The following are the four major components of a...

Architecture of data warehouse

The three-tier architecture is the most widely used data warehouse architecture, as it produces a well-organized data flow from raw information to valuable insights. It consists of the Top, Middle and Bottom Tier. The top tier consists of the front-end client tools that present results through reporting, analysis, and data mining tools. The middle...

Conceptual Modelling of Data Warehouse

The conceptual data model is a structured business view of the data required to support business processes, record business events, and track related performance measures. This model focuses on identifying the data used in the business but not its processing flow or physical characteristics. It is a concise description of the user’s data requirements without taking into account implementation...