A classifier that predicts multiple subsets of classes with a confidence attached to each one

I am curious whether anybody has done work on classifiers that can be trained on a dataset where each sample belongs to more than one class, and then predict not just a single class for each test sample, nor merely a set of options with different confidences, but associate a confidence with every (or at least some) subset of classes for each test sample.

For example, suppose the samples are pictures and three labels can be attached to each picture: beach, water, desert. For some sample the classifier might assign high confidence to (desert), (beach, water), and (beach), lower confidence to just (water), and very low confidence to (desert, water) and (desert, beach). The example is not perfect, but I hope it explains what it is I am looking for.
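What is described here resembles the "label powerset" transformation from multi-label learning: each subset of labels seen in the training data is treated as one atomic class, so any multi-class classifier that outputs probabilities yields a confidence per subset. A minimal sketch, assuming scikit-learn is available; the two-dimensional features and the label subsets below are purely illustrative toy data:

```python
from sklearn.linear_model import LogisticRegression

# Toy data: each sample carries a subset of {beach, water, desert}.
# Label-powerset trick: encode each distinct subset as one atomic class.
X = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8], [0.5, 0.5], [0.6, 0.4]]
subsets = [("beach", "water"), ("beach", "water"), ("desert",), ("desert",),
           ("beach",), ("beach",)]

classes = sorted(set(subsets))              # the distinct subsets
y = [classes.index(s) for s in subsets]     # subset -> class index

clf = LogisticRegression().fit(X, y)

# predict_proba gives one confidence value per label subset
probs = clf.predict_proba([[0.7, 0.3]])[0]
for subset, p in zip(classes, probs):
    print(subset, round(p, 3))
```

One known limitation of this transformation is that only subsets that occur in the training data receive a confidence; unseen combinations get none.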

I am just interested in some reading on the topic. Thank you.

Examples of statistical learning

Where can I find simple real-life examples of the use of statistical machine learning theory?

As I learned from Vapnik, V. N., "The Nature of Statistical
Learning Theory" (1999), there is a classical problem:

There are two asymmetric dice. We have a large list of results of
independent trials (pairs of outcomes) for the dice. How do we calculate a
probability distribution for each die such that, taken together, they give a
probability distribution of the sum of points that is as close as possible
to a distribution we already know from another source? (That is, the trial
results and the probability distribution of the sum are known; the
probability distributions of the individual dice are required.)

ML/DM and other relevant conferences calendar

Google Calendar is a great tool. It's a pity it does not work well with Outlook and Pocket Outlook.

Anyway, I've created a calendar of Machine Learning, Data Mining and other relevant conferences in GCal and am going to populate it further over time. It is public, so you can view the events. If you want to help populate the calendar, please contact me and I will share it with your GCal account.

Text search + CBIR = tagging?

Let's say we have a sufficiently large image library. Consider the following image retrieval scenario:

Some user is looking for an image, for example for his web site design. The easiest UI for this is text search, so the user types keywords and retrieves some images as a result.

The problem here is that the association of keywords with images is usually quite poor in such libraries. The difference in vocabularies used by the contributors of images and the searcher is also an issue. Thus the user may not necessarily obtain the results that match his needs well enough.

At this stage, CBIR can be used. It is a nice approach that allows the user to select which images in the result set match his needs best (and, in some techniques, which do not match them at all), forcing the system to return a new result set that matches the user's needs better than the original one.

So in a few CBIR iterations the user finds what he needs.

Let me repeat: first one step of text search, then a few steps of CBIR.

Now comes the obvious idea: what if we keep the keywords of the initial text search query as tags associated with the images the user marked as relevant? In that case we get not only the contributors, but also the majority of the library's users, to tag its images, which reinforces the text search and possibly reduces the required number of CBIR steps.
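The feedback loop above can be sketched in a few lines. This is only an illustration of the idea, with all names invented for the purpose; the vote threshold guards against a single user's idiosyncratic query polluting the tags:

```python
from collections import Counter, defaultdict

# image_id -> keyword -> number of users whose query contained the keyword
# when they marked this image as relevant during CBIR refinement
tag_votes = defaultdict(Counter)

def record_feedback(query_keywords, relevant_image_ids):
    # called once per search session, after the CBIR iterations finish
    for image_id in relevant_image_ids:
        tag_votes[image_id].update(query_keywords)

def tags_for(image_id, min_votes=2):
    # promote a keyword to a tag only once several users agree on it
    return [kw for kw, n in tag_votes[image_id].items() if n >= min_votes]
```

For example, if two independent sessions that both queried "beach" end with the same image marked relevant, "beach" becomes a tag for that image, while a keyword seen only once does not.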

I do not work on CBIR myself, so I have not come across any publication describing such an idea so far, although I am pretty sure some must exist.


The other interesting resource is MLpedia, a Wiki-based Machine Learning encyclopedia project started by John Winn from Microsoft Research @ Cambridge. A few other individuals are contributing to it. The number of articles is not large yet, with quite a few stubs instead of full articles, but it is open for contributions from anybody, so it may become a valuable resource over time.