Question

Computer Science

Posted 4 months ago

```
Let's say we have a dataset that consists of two classes (i.e. "y+" and "y-") in the target column. For a particular feature, the probabilities of the two
classes (i.e. p(y+) and p(y-)) are each 0.5. What will be the entropy of the feature?


Answer from Sia

Posted 4 months ago

To calculate the entropy of a feature in a binary classification dataset where the probabilities of the two classes are equal (p(y+) = p(y-) = 0.5), we use the entropy formula from information theory.

The entropy $H$ of a random variable is a measure of the amount of uncertainty or impurity in the dataset and is given by the formula:

$H(Y) = -\sum_{i=1}^{n} p(y_i) \log_2 p(y_i)$

In this case, since we have two classes with equal probability, the formula simplifies to:

$H(Y) = -[p(y+) \log_2 p(y+) + p(y-) \log_2 p(y-)]$

Substituting the given probabilities into the formula, we get:

$H(Y) = -[0.5 \log_2 0.5 + 0.5 \log_2 0.5]$

Since $\log_2 0.5 = -1$, this becomes:

$H(Y) = -[0.5 \times (-1) + 0.5 \times (-1)]$

$H(Y) = 1$

Therefore, the entropy of the feature with equal class probabilities of 0.5 is 1 bit. This is the maximum possible entropy for two classes, indicating maximum uncertainty or impurity: the classes are equally likely.
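The calculation above can be sketched in a few lines of Python. This is a minimal illustration of the entropy formula, not a library call; the `entropy` helper is our own name for it:

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; terms with p == 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Equal class probabilities give maximum uncertainty for two classes.
print(entropy([0.5, 0.5]))  # 1.0 bit

# For comparison, a pure node (one class only) has zero entropy.
print(entropy([1.0]))  # 0.0 bits
```

Skipping zero-probability terms matches the convention $0 \log_2 0 = 0$ used in information theory.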
