Summary: Resit 2018
Read the summary and the most important questions on Resit 2018

1 Introduction and Removal of Sensory Noise
(4 pt) We want to apply outlier detection to our data. We are in doubt between using Chauvenet’s criterion and a Simple Distance-Based Approach. List two advantages of using the Simple Distance-Based Approach over Chauvenet’s criterion.
The Simple Distance-Based Approach does not assume that the data follow a normal distribution. Furthermore, it can take multiple attributes into account at the same time.
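A minimal sketch of a simple distance-based outlier check may make the second advantage concrete, since the distance is computed over all attributes at once. The parameter names d_min and f_min, the rule "flag a point when more than a fraction f_min of the other points lie farther away than d_min", and the sample points are assumptions for illustration, not values from the exam.

```python
import math

def distance_based_outliers(points, d_min, f_min):
    """Flag a point when more than a fraction f_min of the other points
    lie farther away than d_min (Euclidean distance over all attributes)."""
    n = len(points)
    flags = []
    for i, p in enumerate(points):
        far = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) > d_min
        )
        flags.append(far / (n - 1) > f_min)
    return flags

# Hypothetical two-attribute data: four points close together, one far away.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.0, 1.1), (8.0, 9.0)]
flags = distance_based_outliers(points, d_min=2.0, f_min=0.9)
print(flags)  # [False, False, False, False, True]
```

No distributional assumption is used anywhere in this check, which is the first advantage named in the answer.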
(7 pt) Consider Figure 1, showing a part of the sensor data of Cristiano. We want to use a low-pass filter to filter out the high-frequency noise we observe in the signal. Argue what would be a suitable frequency and show what figure results after applying the low-pass filter.
The low-frequency signal seems to complete a full cycle in around 10 seconds (slightly more), which corresponds to a frequency of around 0.1 Hz. To be on the safe side, we could set the cutoff frequency fc to 0.15 Hz. The resulting figure would be the same as the original, except that the small variations around the low-frequency signal would no longer be present.
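The effect of a 0.15 Hz cutoff can be illustrated with a small sketch. It uses a plain first-order (RC-style) low-pass filter, which is a stand-in chosen for simplicity and not necessarily the filter used in the course. The sampling rate of 10 Hz and the 2 Hz "noise" frequency are assumptions, since Figure 1 is not available here.

```python
import math

def low_pass(signal, fs, fc):
    """First-order low-pass filter; fs is the sampling rate and fc the cutoff, both in Hz."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * fc)
    alpha = dt / (rc + dt)
    out = [signal[0]]
    for x in signal[1:]:
        # Move a fraction alpha of the way from the previous output to the new sample.
        out.append(out[-1] + alpha * (x - out[-1]))
    return out

fs = 10.0  # assumed sampling rate in Hz
t = [i / fs for i in range(300)]  # 30 seconds of samples
slow = [math.sin(2 * math.pi * 0.1 * ti) for ti in t]  # 0.1 Hz component, as in the answer
fast = [math.sin(2 * math.pi * 2.0 * ti) for ti in t]  # assumed 2 Hz noise component

slow_out = low_pass(slow, fs, fc=0.15)
fast_out = low_pass(fast, fs, fc=0.15)

# Peak amplitude over the last 10 seconds, after the start-up transient has died out.
slow_peak = max(abs(v) for v in slow_out[200:])
fast_peak = max(abs(v) for v in fast_out[200:])
print(slow_peak, fast_peak)
```

In this sketch the 0.1 Hz component keeps most of its amplitude while the 2 Hz component is reduced to a small fraction of it. A first-order filter has a gentle roll-off, so a higher-order filter would preserve the slow signal better.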
2 Feature Engineering

(3 pt) Explain the difference between the frequency and the time domain.
“The time domain summarizes values within a historical window by considering the measured values and applying some aggregation function over them (e.g. mean, slope). The frequency domain considers the periodicity in the historical values and decomposes the signal into sinusoid functions with different frequencies, each having its own amplitude.”
(3 pt) Imagine we know that only the most dominant frequency is relevant for making predictions in a dataset. Which of the three aggregation functions for the frequency domain would be best to apply for this case? Argue why.
The frequency with the highest amplitude, since this gives us the most dominant frequency at each time point.
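As a rough illustration of "the frequency with the highest amplitude", the sketch below computes a naive discrete Fourier transform over one window and returns the frequency whose component has the largest amplitude. The window contents, the sampling rate, and the function name are hypothetical.

```python
import cmath
import math

def dominant_frequency(window, fs):
    """Return the frequency (Hz) with the highest amplitude in the window,
    using a naive DFT and ignoring the constant (0 Hz) component."""
    n = len(window)
    best_freq, best_amp = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(window)
        )
        amp = abs(coeff)
        if amp > best_amp:
            best_freq, best_amp = k * fs / n, amp
    return best_freq

fs = 10.0  # assumed sampling rate in Hz
# Hypothetical window: a strong 1 Hz component plus a weaker 3 Hz component.
window = [
    2.0 * math.sin(2 * math.pi * 1.0 * i / fs)
    + 0.5 * math.sin(2 * math.pi * 3.0 * i / fs)
    for i in range(40)
]
dominant = dominant_frequency(window, fs)
print(dominant)  # 1.0
```

Applying this to a sliding window would give one dominant frequency per time point, which is the feature the answer refers to.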
(5 pt) Consider Table 1. Extend the table with a feature in the time domain for the respiration attribute in which you take the mean with a value of λ = 1. Explain how you came to your answer.
We consider the current and the previous time point (window size of λ = 1) and take the mean of those two values. See table.
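The computation can be sketched as follows. Table 1 is not reproduced here, so the respiration values are hypothetical. The first λ time points have no complete window and are left empty in this sketch.

```python
def windowed_mean(values, lam):
    """Mean over the current value and the lam previous values.
    Time points without a full window get None."""
    result = []
    for t in range(len(values)):
        if t < lam:
            result.append(None)
        else:
            window = values[t - lam : t + 1]
            result.append(sum(window) / len(window))
    return result

respiration = [12, 14, 16, 14]  # hypothetical values, not taken from Table 1
means = windowed_mean(respiration, lam=1)
print(means)  # [None, 13.0, 15.0, 15.0]
```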
(4 pt) Provide and explain the NLP pipeline explained during the lecture, which we use before we can start identifying attributes in text data.
 tokenization: identify sentences and words.
 lower case: change upper case into lower case.
 stemming: map each word to its stem.
 stop word removal: remove stop words from the resulting words.
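The four steps above can be sketched in a few lines. The stemmer here is a toy suffix stripper and the stop word list is a tiny illustrative set. Both are assumptions made to keep the example self-contained, and a real pipeline would use an established stemmer and stop word list.

```python
import re

STOP_WORDS = {"i", "was", "in", "the", "and", "a"}  # tiny illustrative list
SUFFIXES = ("ing", "ed", "s")  # toy stemmer, not a real stemming algorithm

def tokenize(text):
    """Split text into sentences, then each sentence into words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [re.findall(r"[A-Za-z]+", s) for s in sentences]

def stem(word):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    processed = []
    for words in tokenize(text):                           # 1. tokenization
        words = [w.lower() for w in words]                 # 2. lower case
        words = [stem(w) for w in words]                   # 3. stemming
        words = [w for w in words if w not in STOP_WORDS]  # 4. stop word removal
        processed.append(words)
    return processed

result = preprocess("I was walking in the park. The dogs jumped and played.")
print(result)  # [['walk', 'park'], ['dog', 'jump', 'play']]
```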

3 Clustering

(5 pt) Given the table in the data, we want to apply a feature based distance metric to compute the distance between the two timeseries. Assume we use the mean as feature. Compute the distance between the time series of person 1 and person 2.
We take the mean of the first series (0) and of the second (0.5) and compute the distance (let’s say Euclidean), which ends up being \sqrt{(0 − 0.5)^2} = 0.5.
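A small sketch of a feature-based distance with the mean as the only feature is given below. The table from the exam is not available here, so the two series are hypothetical and chosen only so that their means are 0 and 0.5, as in the answer.

```python
import math
from statistics import mean

def feature_distance(series_a, series_b, features=(mean,)):
    """Euclidean distance between the feature vectors of two time series."""
    return math.sqrt(
        sum((f(series_a) - f(series_b)) ** 2 for f in features)
    )

person1 = [-1, 0, 1, 0]  # hypothetical series with mean 0
person2 = [0, 1, 0, 1]   # hypothetical series with mean 0.5
distance = feature_distance(person1, person2)
print(distance)  # 0.5
```

With a single feature the Euclidean distance reduces to the absolute difference of the two means. Adding more functions to features would extend the same computation to several features.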
(6 pt) Compute the distance between the two time series using the cross correlation coefficient. Show how you came to your answer. In this explanation, also show what the optimal shift τ is.
For the distance we should multiply the corresponding values of the two series and sum the products, given a shift τ that we make (we actually minimize one divided by this number, which is the same as maximizing this number). With shifts:
 τ = 0: 8*11 + 8*12 + 8*11 + 8*12 = 88 + 96 + 88 + 96 = 368
 τ = 1 (we shift person 1): 8*11 + 8*12 + 8*11 = 88 + 96 + 88 = 272
 τ = 1 (we shift person 2): 8*12 + 8*11 + 8*12 = 96 + 88 + 96 = 280
a value of 280. 
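The three sums in the answer can be reproduced with a short sketch. The two series are inferred from the products written out above (person 1 as 8, 8, 8, 8 and person 2 as 11, 12, 11, 12), which is an assumption because the original table is not shown. The function name is hypothetical.

```python
def shifted_product_sum(x, y, tau):
    """Sum of x[k + tau] * y[k] over the part where the two series overlap."""
    return sum(
        x[k + tau] * y[k]
        for k in range(len(y))
        if 0 <= k + tau < len(x)
    )

person1 = [8, 8, 8, 8]       # inferred from the worked answer
person2 = [11, 12, 11, 12]   # inferred from the worked answer

no_shift = shifted_product_sum(person1, person2, 0)  # 368
shift_p1 = shifted_product_sum(person1, person2, 1)  # 272, person 1 shifted
shift_p2 = shifted_product_sum(person2, person1, 1)  # 280, person 2 shifted
print(no_shift, shift_p1, shift_p2)  # 368 272 280
```

Note that a shifted comparison has one fewer overlapping pair than the unshifted one, which affects how the raw sums compare.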
(4 pt) Explain the difference between k-means clustering and agglomerative clustering.
In k-means clustering you find a fixed number of k clusters, with k chosen in advance. Using agglomerative clustering, you start with each point in its own cluster and combine clusters until you end up with all points in a single cluster. Hence, you have different numbers of clusters at different stages of this process.
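The "different numbers of clusters at different stages" can be made visible with a minimal agglomerative sketch on one-dimensional points. It merges the two closest clusters at each step using single linkage, which is one possible linkage choice and an assumption here. The points are hypothetical.

```python
def agglomerative_stages(points):
    """Start with one cluster per point and repeatedly merge the two closest
    clusters (single linkage), recording the clustering at every stage."""
    clusters = [[p] for p in points]
    stages = [[sorted(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        stages.append([sorted(c) for c in clusters])
    return stages

points = [1.0, 1.5, 5.0, 5.2, 9.0]  # hypothetical data
stages = agglomerative_stages(points)
cluster_counts = [len(stage) for stage in stages]
print(cluster_counts)  # [5, 4, 3, 2, 1]
```

Each entry of stages is one possible clustering, so the number of clusters is chosen afterwards by picking a stage, whereas k-means needs k before it starts.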
(5 pt) Explain the subspace clustering algorithm in words. What is the advantage of the clustering method compared to the other clustering approaches that have been discussed during the course?
You define ε distinct intervals for each attribute. You start with single attributes and, in each interval for that attribute, look for dense units (units that contain more than a certain number of points). These can be combined into clusters if they have a common face or when they are connected. After looking at the single attributes, you look at combinations of multiple attributes, find clusters in there, etc. The advantage of the algorithm is that it can handle a large number of features, while the other clustering algorithms cannot.