AI / Искусственный Интеллект

2019 July 22

DP

Defragmented Panda in AI / Искусственный Интеллект
why do you think this data is not good for neural networks?

a nn can work well with a small input size too

LS

Luke Skywalker in AI / Искусственный Интеллект
Status 1 is cancer

LS

Luke Skywalker in AI / Искусственный Интеллект
There are many more columns: 152

LS

Luke Skywalker in AI / Искусственный Интеллект
Defragmented Panda
why do you think this data is not good for neural networks?

a nn can work well with a small input size too
My research papers say neural networks need thousands or millions of rows, or else they overfit.

LS

Luke Skywalker in AI / Искусственный Интеллект
I have a very small dataset of 1800 rows.

DP

Defragmented Panda in AI / Искусственный Интеллект
Luke Skywalker
I have a very small dataset of 1800 rows.
ahhh, now i get what you mean by it.

DP

Defragmented Panda in AI / Искусственный Интеллект
Luke Skywalker
My research papers say neural networks need thousands or millions of rows, or else they overfit.
that's right in general.

but there is a more specific rule of thumb:

a neural network is much more likely to overfit if it has more neurons than the dataset has rows

use a neural network with about 100 neurons and it is far less likely to overfit.
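
To put rough numbers on the rule of thumb above, here is a minimal sketch that counts the trainable parameters of a small fully connected network and compares the count with the dataset size. The helper `mlp_parameter_count` and the layer layouts are hypothetical. It also assumes that all 152 columns are input features and that there is a single output, which the chat does not confirm. Counting weights rather than neurons is a swapped-in, stricter measure of model capacity, not the rule stated in the message.

```python
def mlp_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix between the two layers
        total += fan_out           # one bias per unit in the receiving layer
    return total


N_ROWS = 1800      # dataset size mentioned in the chat
N_FEATURES = 152   # column count mentioned in the chat (assumed to be all inputs)

for hidden in (4, 10, 100):
    params = mlp_parameter_count([N_FEATURES, hidden, 1])
    print(f"hidden={hidden:>3}  parameters={params:>6}  "
          f"rows per parameter={N_ROWS / params:.2f}")
```

Under these assumptions a single hidden layer of 100 neurons already has 15401 parameters, far more than 1800 rows, so a neuron count below the row count does not rule out overfitting by itself. Checking accuracy on held-out data remains the more reliable test.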

LS

Luke Skywalker in AI / Искусственный Интеллект
I will have a very hard time defending neural networks on this data in my presentation

DP

Defragmented Panda in AI / Искусственный Интеллект
that's right. a nn is a black box

LS

Luke Skywalker in AI / Искусственный Интеллект
My professor already said not to even touch neural networks

DP

Defragmented Panda in AI / Искусственный Интеллект
that's a good reason, okay

(banks have similar requirements, for example)

LS

Luke Skywalker in AI / Искусственный Интеллект
Yes

LS

Luke Skywalker in AI / Искусственный Интеллект
I think my main problem is feature selection

DP

Defragmented Panda in AI / Искусственный Интеллект
what sort of operations can your tree do?

can it use only (binary) 0/1 data? or can it use (float) 0.376 data too?

LS

Luke Skywalker in AI / Искусственный Интеллект
Random forest can use any data: integer, float, or categorical

DP

Defragmented Panda in AI / Искусственный Интеллект
why did you decide to use one-hot encoding like this:
age 18 = no
age 25 = yes
age 30 = no
male = yes
female = no

instead of float inputs like this:
age: 0.5
gender: 1.0

DP

Defragmented Panda in AI / Искусственный Интеллект
this would decrease the number of features by a factor of 2 to 3

seems worth trying to me
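
The two encodings being compared can be sketched as follows. The value lists, the function names, and the 18 to 30 age range are hypothetical, taken only from the small example in the message above. The real dataset's columns are not shown in the chat.

```python
AGE_VALUES = [18, 25, 30]            # hypothetical age categories from the example
GENDER_VALUES = ["male", "female"]   # hypothetical gender categories


def one_hot_row(age, gender):
    """One 0/1 column per possible value of each attribute."""
    age_part = [int(age == a) for a in AGE_VALUES]
    gender_part = [int(gender == g) for g in GENDER_VALUES]
    return age_part + gender_part


def float_row(age, gender, age_min=18, age_max=30):
    """One numeric column per original attribute."""
    age_scaled = (age - age_min) / (age_max - age_min)  # min-max scaling to [0, 1]
    gender_flag = 1.0 if gender == "male" else 0.0
    return [age_scaled, gender_flag]


print(one_hot_row(25, "male"))   # [0, 1, 0, 1, 0]
print(len(one_hot_row(25, "male")), "columns vs", len(float_row(25, "male")))
```

In this toy case five columns shrink to two. For tree-based models the scaling step is optional, since a split on the raw age gives the same partition as a split on the scaled value. Only a variable with a natural order, such as age, collapses into one number without inventing an ordering.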

DP

Defragmented Panda in AI / Искусственный Интеллект
there is also an option to cheat:

increase the number of rules in your forest to 2000 or so

then your forest will 'overfit' on the data too.

it will answer correctly on your dataset, but not on other data.

might be enough to fool a professor
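
Why a perfect score on your own dataset proves little can be shown with a deliberately extreme stand-in. The `Memorizer` class below is hypothetical and is not a random forest: it keeps one "rule" per training row, which is roughly what a tree grown until every leaf holds a single row ends up doing. The data is synthetic and exists only to make the gap between training and held-out accuracy visible.

```python
from collections import Counter


class Memorizer:
    """Stand-in for an over-grown model: one 'rule' per training row."""

    def fit(self, rows, labels):
        self.table = dict(zip(rows, labels))
        # Fallback for rows never seen in training: most common training label.
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.table.get(r, self.default) for r in rows]


def accuracy(predicted, actual):
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Synthetic data: every row is unique, labels follow an arbitrary pattern.
rows = list(range(100))
labels = [1 if (i * i) % 7 < 3 else 0 for i in rows]

train_rows, test_rows = rows[:80], rows[80:]
train_labels, test_labels = labels[:80], labels[80:]

model = Memorizer().fit(train_rows, train_labels)
train_acc = accuracy(model.predict(train_rows), train_labels)
test_acc = accuracy(model.predict(test_rows), test_labels)
print(f"train accuracy={train_acc:.2f}  held-out accuracy={test_acc:.2f}")
```

The training score is perfect by construction, while the held-out score is no better than always guessing the majority label. Reporting accuracy on rows the model never saw is what exposes this kind of cheat.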

О

Орхан in AI / Искусственный Интеллект
🌚

О

Орхан in AI / Искусственный Интеллект
It's when your own neural network got overfitted to pass an exam