
Summary Business Intelligence & Business Analytics

Course
- Business Intelligence & Business Analytics
- 2022 - 2022
- Tilburg University (Tilburg University, Tilburg)
- Information Management
272 Flashcards & Notes

A snapshot of the summary - Business Intelligence & Business Analytics

  • 1 BIBA

  • 1.1.2 Introduction to Databases


  • What is a database and what does it consist of?

    • Definition database: “A collection of related tables, designed, maintained and utilized by multiple users, with software to update & query the data”
      A database system consists of:
      • 1. Data (the database) 
      • 2. Software 
      • 3. Hardware 
      • 4. Users 
  • What is a database management system (DBMS) and how is it used?
    A database management system (DBMS) is the software that controls the data
    • Oracle, DB2, MS Access, MS SQL Server (Azure) 
    • MySQL (open source)
    Manipulation of data using a query language 
    • E.g. SQL (Structured Query Language) 
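    As a minimal sketch of manipulating data through a query language, the snippet below sends SQL statements to a DBMS from Python. It assumes SQLite (via Python's built-in sqlite3 module) as a stand-in, which is not one of the products listed above, and the Book table with its Title column is hypothetical.

```python
import sqlite3

# SQLite stands in for the DBMS here (an assumption; it is not in the list above).
conn = sqlite3.connect(":memory:")

# Hypothetical table used only for illustration.
conn.execute("CREATE TABLE Book (Bno INTEGER, Title TEXT)")

# Manipulate data with SQL: insert two records, then update one of them.
conn.executemany(
    "INSERT INTO Book (Bno, Title) VALUES (?, ?)",
    [(1, "Databases"), (2, "Data Mining")],
)
conn.execute("UPDATE Book SET Title = ? WHERE Bno = ?", ("Data Warehousing", 2))

# Query the data back.
rows = conn.execute("SELECT Bno, Title FROM Book ORDER BY Bno").fetchall()
print(rows)
```

    The same statements would look almost identical on the other systems; only the connection step differs per DBMS.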

  • What is the database terminology?
    • Database 
    • Table = structured list of data of a specific type, divided by columns and rows 
    • Record/Tuple 
    • Field/Attribute 
      • Domain
  • 1.1.3 Relational database


  • What is a primary key and foreign key?
    PK: Field(s) that uniquely identify each record in a table
      • Bno (Book), Rno (Reader), Bno + Rno + Loan date (Loan) 
      • Null value (=no data entry) not allowed for PK
    FK: Attribute whose values match the primary key values in the related (parent) table 
    • E.g.: Vendor sells Products 
      • Precisely 1 vendor per product 
      • Conversely, a vendor might sell multiple products 
      • FK = ‘Vendor_Code’ [Product table]
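    The Vendor/Product example can be sketched as follows. This is an illustration under assumptions: SQLite (via Python's sqlite3 module) stands in for the DBMS, and every column and value other than Vendor_Code is hypothetical. It shows the DBMS rejecting a duplicate primary key and a foreign key that has no matching parent record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled

# Parent table: one record per vendor, identified by its primary key.
conn.execute("""
    CREATE TABLE Vendor (
        Vendor_Code INTEGER NOT NULL PRIMARY KEY,
        Name        TEXT
    )""")

# Child table: each product references precisely one vendor through the FK.
conn.execute("""
    CREATE TABLE Product (
        Product_Code INTEGER NOT NULL PRIMARY KEY,  -- hypothetical column
        Description  TEXT,                          -- hypothetical column
        Vendor_Code  INTEGER NOT NULL REFERENCES Vendor (Vendor_Code)
    )""")

# One vendor may sell multiple products (hypothetical sample data).
conn.execute("INSERT INTO Vendor VALUES (10, 'Acme')")
conn.executemany("INSERT INTO Product VALUES (?, ?, ?)",
                 [(1, "Hammer", 10), (2, "Saw", 10)])


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_pk_ok = try_insert("INSERT INTO Vendor VALUES (10, 'Other')")    # PK 10 exists
orphan_fk_ok = try_insert("INSERT INTO Product VALUES (3, 'Drill', 99)")   # no vendor 99
products_of_vendor = conn.execute(
    "SELECT COUNT(*) FROM Product WHERE Vendor_Code = 10").fetchone()[0]
print(duplicate_pk_ok, orphan_fk_ok, products_of_vendor)
```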
  • What are 4 trends in the database world?
      • Trend 1: From disk-based to In-memory databases (e.g. SAP HANA)
      • Trend 2: From on-premise databases to cloud databases (e.g. MS Azure, Google Cloud SQL)
        • db use becomes an operating expense instead of capital expense
      • Trend 3: NoSQL ("not only SQL") databases
        • Hypothesis: For analytics, relational databases are dominant
        • NoSQL Databases: Key-value, Document, Graph, Wide-column
      • Trend 4: Alternative data representations
        • For storing document-oriented files with hierarchies; use XQuery

  • 1.2.3 Data warehouse architectures


  • Which DW development approaches are there and which is the best?
    • Data mart approach (bottom-up)
      • DW = a collection of data marts
      • Dimensional modeling
        • Consistency achieved by conformed dimensions
      • E.g.: Independent data marts, Bus, ‘Canned’ data warehouse
    • Enterprise dw approach (top-down)
      • DW = one integrated database
      • Entity-relationship modeling
      • E.g.: Hub & spoke (EDW + dependent data marts), Federated DW
    • Which approach is best?
      • There is no one-size-fits-all DW strategy; the best approach depends on management’s information needs, the information interdependence between organizational units, …
  • 1.6.2 Naïve Bayes


  • Why is Laplace smoothing needed?
    From previous slides, the probability of α given class c:
    • P( Outlook=“Sunny” | PlayTennis=“Yes” ) = 0
    • Problem:
      • An attribute value doesn’t occur with every class
      • Probability of α given class c becomes 0
    • Having a probability zero is problematic, because it wipes out all information in other probabilities
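    A minimal sketch of why a single zero is so damaging: Naïve Bayes multiplies the class prior by all conditional probabilities, so one zero factor makes the whole score zero. Only the zero for Outlook=“Sunny” comes from the example above; all other numbers and attribute names are hypothetical.

```python
from math import prod

# P(attribute value | PlayTennis = "Yes"); only the 0.0 is taken from the example,
# the remaining values are hypothetical and deliberately high.
likelihoods_yes = {
    "Outlook=Sunny": 0.0,
    "Temperature=Cool": 0.9,
    "Humidity=Normal": 0.95,
}
prior_yes = 0.6  # hypothetical class prior

# Naive Bayes score for the class: prior times the product of the likelihoods.
score_yes = prior_yes * prod(likelihoods_yes.values())
print(score_yes)  # 0.0, regardless of how strong the other evidence is
```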
  • What is Laplace smoothing?
    • Laplace Smoothing, or Correction, or Estimator 
      • Incorporates a small-sample correction in every probability computation
      • Adds a small constant to the count in the numerator and a matching amount to the denominator
      • Thus, no probability will be zero
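    The correction can be sketched as below. This assumes the common additive form, (count + α) / (class count + α · number of attribute values) with α = 1, which may differ in detail from the version on the course slides. The counts are hypothetical.

```python
def smoothed_probability(value_count, class_count, n_values, alpha=1.0):
    """Additive (Laplace) estimate of P(attribute value | class).

    value_count: records of the class that have this attribute value
    class_count: all records of the class
    n_values:    number of distinct values the attribute can take
    alpha:       smoothing constant (1.0 gives add-one smoothing)
    """
    return (value_count + alpha) / (class_count + alpha * n_values)


# Hypothetical counts: 9 "Yes" records, Outlook has 3 possible values,
# and "Sunny" never occurs together with "Yes".
counts_yes = {"Sunny": 0, "Overcast": 4, "Rain": 5}
class_count = sum(counts_yes.values())

smoothed = {
    value: smoothed_probability(count, class_count, len(counts_yes))
    for value, count in counts_yes.items()
}
print(smoothed["Sunny"])       # 1/12 instead of 0
print(sum(smoothed.values()))  # the estimates still sum to (approximately) 1
```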
  • What are the advantages and disadvantages of Naïve Bayes?
    • Naïve Bayes is not so naïve:
      • Its beauty is in its simplicity
      • Ability to handle categorical variables directly
      • Computationally efficient
      • Good classification performance, especially when the number of predictors is very large
    • Negative aspects:
      • Requires a very large number of records to obtain good results
      • Independence assumption may not hold for some attributes
  • 1.7.1 Quizzes


    • Suppose you are given 8 items (records) with numerical variables X1 and X2 and a dependent variable y that corresponds to color (blue/red).
    • A plot illustrates the data in 2D space
    • Task → predict the class of the new item (1,1) using k-Nearest neighbors with Euclidean distance and K=3.
    • 1. Class of item (1,1) is red.
    • 2. Class of item (1,1) is blue.
    Answer: Red
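    The procedure behind this quiz can be sketched as follows. The 8 training points are hypothetical, because the quiz plot is not available in this text; they were chosen only to illustrate the mechanics, so the result does not verify the quiz answer. The function name knn_predict is also an assumption.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training items."""
    # Sort training items by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data: ((X1, X2), y). These are NOT the points from the quiz plot.
train = [
    ((0.0, 1.0), "red"), ((1.0, 2.0), "red"), ((2.0, 2.0), "red"), ((0.0, 3.0), "red"),
    ((2.5, 0.0), "blue"), ((3.0, 0.0), "blue"), ((3.0, 1.0), "blue"), ((4.0, 2.0), "blue"),
]

prediction = knn_predict(train, (1.0, 1.0), k=3)
print(prediction)  # red for this hypothetical data
```

    With K=3 and two classes a vote can never tie, which is one reason odd values of K are a common choice for binary problems.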
