This is the kind of thing I just don't understand the value or use of:
This paper is the report of a study conducted by five people – four at Stanford, and one at the University of Wisconsin – which tried to establish whether computer-generated algorithms could “recognize” literary genres. You take David Copperfield, run it through a program without any human input – “unsupervised”, as the expression goes – and … can the program figure out whether it’s a gothic novel or a Bildungsroman? The answer is, fundamentally, Yes: but a Yes with so many complications that it becomes necessary to look at the entire process of our study. These are new methods we are using, and with new methods the process is almost as important as the results.
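For readers unfamiliar with what "unsupervised" means in practice, here is a toy, pure-stdlib sketch of the kind of step the paper describes: texts are reduced to word-frequency vectors and then grouped by a small k-means clustering, with no genre labels supplied by a human. The miniature corpus, the feature choice, and the initialization are my own illustrative assumptions, not the study's actual pipeline.

```python
# Toy sketch of unsupervised text grouping (illustrative only):
# each text becomes a normalized word-frequency vector, and a
# tiny k-means groups the vectors without any genre labels.
from collections import Counter
import math

def vectorize(text, vocab):
    # Normalized word frequencies over a shared vocabulary.
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocab]

def distance(a, b):
    # Euclidean distance between two frequency vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k=2, iters=20):
    # Farthest-point initialization: each new center is the text
    # least like the centers chosen so far.
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors, key=lambda v: min(distance(v, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[min(range(k), key=lambda c: distance(v, centers[c]))].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return [min(range(k), key=lambda c: distance(v, centers[c])) for v in vectors]

texts = [
    "the castle was dark and the ghost wailed in the dark night",      # gothic-ish
    "the ruined abbey was haunted and the ghost wailed at night",      # gothic-ish
    "the boy grew up and learned his trade and made his way in life",  # bildungsroman-ish
    "the young man learned from his master and grew into his career",  # bildungsroman-ish
]
vocab = sorted({w for t in texts for w in t.lower().split()})
labels = kmeans([vectorize(t, vocab) for t in texts])
```

On this contrived corpus the two gothic-flavored texts end up in one cluster and the two Bildungsroman-flavored texts in the other, without the program ever being told what a genre is; whether the resulting groups correspond to anything a reader would call a "genre" is exactly the question the paper wrestles with.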
So human beings, over a period of centuries, read many, many books and come up with heuristic schemes to classify them: they identify various genres, that is to say, "kinds," kinship groups. Then those human beings specify the features they see as necessary to the various kinds, write complex programs containing instructions for discerning those features, and run those programs on computers . . . to see how well (or badly) computers can replicate what human beings have already done?
I don't get it. Shouldn't we be striving to get computers to do things that human beings can't do, or can't do as well? The primary value I see in this project is that being forced to specify the features we see as intrinsic to genres could be conceptually clarifying. But in that case the existence of programmable computers becomes just a prompt, and an accidental one, not essential, to the enterprise of thinking more clearly and precisely.