What are Black Swan events?
A Black Swan event is an event in human history that was unprecedented and unexpected at the time it occurred. However, after evaluating the surrounding context, domain experts (and in some cases even laymen) can usually conclude that “it was bound to happen”. Even though some parameters may differ (such as the event’s time, location, or specific type), it is likely that similar incidents have had similar effects in the past.
The term Black Swan originates from the (Western) belief that all swans are white, because white swans were the only ones that had been observed. However, in 1697 the Dutch explorer Willem de Vlamingh discovered black swans in Australia. This was an unexpected event in (scientific) history and profoundly changed zoology. After the black swans were discovered, it seemed obvious that they had to exist, just as other animals were known to exist in varying colors. In retrospect, the surrounding context (i.e., the observations about other animals) seemed to imply the Black Swan assumption, which empirical evidence then validated.
Detecting and analyzing Black Swan events helps us gain a better understanding of why certain developments recur throughout history and what effects they have.
What is this site about?
The Black Swan application helps users find Black Swan events throughout modern history. For this purpose, the application identifies outliers in statistical data and associates them with historic events. An outlier is a point in a statistic that does not “fit” the overall trend, e.g., an inflection point in a curve. The rules used to join events to outliers are determined automatically using data mining techniques. The results can be explored through a web interface.
Details about the architecture of Black Swan can be found at the architecture page.
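To make the idea of outliers and their association with events more concrete, here is a minimal Python sketch. It is not Black Swan's actual implementation: it swaps in a simple robust test on year-over-year changes (median absolute deviation) for outlier detection, and a fixed "same country, same year" rule in place of the automatically mined rules. The function names, the threshold, and all data values are hypothetical and chosen for illustration only.

```python
from statistics import median


def find_outliers(series, threshold=3.0):
    """Return the years whose year-over-year change deviates strongly
    from the typical change in the series.

    series: dict mapping year -> value (assumed input format).
    """
    years = sorted(series)
    # Year-over-year changes, keyed by the later year of each pair.
    changes = {b: series[b] - series[a] for a, b in zip(years, years[1:])}
    if not changes:
        return []
    med = median(changes.values())
    # Median absolute deviation: a spread measure that is robust to outliers.
    mad = median(abs(c - med) for c in changes.values())
    if mad == 0:
        # All typical changes are identical; anything different stands out.
        return [y for y, c in changes.items() if c != med]
    return [y for y, c in changes.items() if abs(c - med) / mad > threshold]


def annotate(outlier_years, events, country):
    """Pair each outlier year with the events in the same country and year.

    This fixed rule is a stand-in for rules learned by data mining.
    """
    return {
        year: [e["name"] for e in events
               if e["country"] == country and e["year"] == year]
        for year in outlier_years
    }


# Made-up numbers and a made-up event, for illustration only.
life_expectancy = {1990: 50.0, 1991: 50.5, 1992: 51.0, 1993: 51.4,
                   1994: 38.0, 1995: 51.9, 1996: 52.5}
events = [{"name": "Hypothetical conflict", "country": "X", "year": 1994}]

outliers = find_outliers(life_expectancy)
print(annotate(outliers, events, "X"))
```

With this toy data, both the sharp drop in 1994 and the rebound in 1995 are flagged, and only 1994 is matched to an event. A real system would need more careful rules to decide which of the neighboring outliers an event explains.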
What is the motivation?
Black Swan was developed by a team of 12 Computer Science MSc students in the seminar BlackSwan: Automated annotation of global statistics. The seminar was supervised by Professor Felix Naumann and Johannes Lorey of the Information Systems chair at the Hasso Plattner Institute.
The goal is to automatically detect relationships similar to the ones indicated in this initial mock-up:
Where does the data come from?
- Events: The event data is collected from DBpedia, Freebase, NOAA, Correlates of War, EM-DAT and BBC Timelines. After identifying and merging duplicates, Black Swan contains information about more than 40,000 events.
- Statistics: The statistics are primarily obtained from data files available from the Gapminder project. Other sources, such as the World Bank or Correlates of War, will be integrated in the future. Black Swan provides more than 400 statistics containing over 547,000 outliers in about 200 countries.
- Locations: The unique locations for events and statistics are imported from GeoNames.