Giannina Segnini, Director of the Master of Science Data Concentration Program, Columbia University Journalism School
Ronny Rojas, Editor of Univision News’ Data Unit
More than 19 million people in the United States embarked on a cruise last year, and most of them probably thought they had bought a ticket from an American company. What few people know is that almost all of those ships are registered in other countries. That means the laws of those countries apply to any incident that occurs on board.
The surprise generated by that first piece of information motivated a group of students from the School of Journalism at Columbia University in New York to start an extensive compilation of international data on the cruise industry. The work was carried out as part of the course “Using data to investigate across borders.”
The team used tools to extract and consolidate more than 16 international data sources that report daily information on inspections carried out on ships in various ports of the world, the deficiencies identified in each inspection and the number of times ships were detained because of severe environmental, vessel, crew or passenger safety problems.
The students also used databases to identify the historical owners and operators of cruise ships, as well as their true beneficiaries. The analysis involved searching for matches among company addresses, phone numbers and other variables, which allowed the students to establish relationships between companies.
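The matching described above can be illustrated with a minimal Python sketch. It is not the students' actual code: the company records, the `norm_address` and `norm_phone` helpers and the `link_companies` function are all hypothetical, and the sketch assumes that two companies are related whenever they share a normalized address or phone number.

```python
import re
from collections import defaultdict

# Hypothetical records; every name, address and phone number is invented.
companies = [
    {"name": "Alpha Shipping Ltd", "address": "12 Harbour Rd., Nassau", "phone": "+1 (242) 555-0101"},
    {"name": "Beta Marine Inc", "address": "12 harbour rd nassau", "phone": "+1 242 555 0199"},
    {"name": "Gamma Cruises SA", "address": "7 Calle 50, Panama City", "phone": "1-242-555-0199"},
    {"name": "Delta Lines", "address": "3 Quay St, Valletta", "phone": "+356 5550 0123"},
]

def norm_address(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())

def norm_phone(text):
    """Keep digits only, so formatting differences do not block a match."""
    return re.sub(r"\D", "", text)

def link_companies(records):
    """Group companies that share a normalized address or phone number."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group's representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # (field, normalized value) -> index of first record with it
    for idx, rec in enumerate(records):
        keys = [("address", norm_address(rec["address"])),
                ("phone", norm_phone(rec["phone"]))]
        for key in keys:
            if key in seen:
                parent[find(idx)] = find(seen[key])  # merge the two groups
            else:
                seen[key] = idx

    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[find(idx)].append(rec["name"])
    return sorted(sorted(names) for names in groups.values())

related = link_companies(companies)
```

In this invented example, the first two companies share an address and the second and third share a phone number, so all three end up in one group. In real data such links are leads to verify, not proof of common ownership.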
The cruise industry, which has a worldwide annual economic impact of $120 billion, is dominated by two large companies with offices in Miami: Royal Caribbean Cruises Ltd and Carnival Cruise Line.
In total, the team was able to obtain and consolidate 85 different variables for each of the 411 cruise ships that meet the official definition of the U.S. Coast Guard: the capacity to accommodate 250 people or more.
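As an illustration of the capacity threshold mentioned above, the following Python sketch filters a list of vessels. The ship records and the field name `capacity` are hypothetical; only the 250-person threshold comes from the text.

```python
# Threshold taken from the definition cited in the text.
MIN_CAPACITY = 250

# Hypothetical vessel records; names and capacities are invented.
ships = [
    {"name": "Example Star", "capacity": 3200},
    {"name": "Example Breeze", "capacity": 250},
    {"name": "Example Skiff", "capacity": 120},
]

def is_cruise_ship(ship, min_capacity=MIN_CAPACITY):
    """Return True when the vessel can accommodate min_capacity people or more."""
    return ship["capacity"] >= min_capacity

cruise_ships = [s["name"] for s in ships if is_cruise_ship(s)]
```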
Working with the data involved a strenuous process of cleaning and standardization because the 16 consolidated databases used different names and terminology to refer to vessel owners, inspections and deficiencies.
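A minimal Python sketch of this kind of standardization follows. It is an assumption about how the step could be done, not a description of the team's actual workflow; the `standardize_name` helper and the list of legal suffixes are hypothetical.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing owner names.
LEGAL_SUFFIXES = {"LTD", "LIMITED", "INC", "CORP", "CORPORATION", "PLC", "SA", "LLC"}

def standardize_name(raw):
    """Uppercase, strip punctuation and drop legal suffixes from a company name."""
    cleaned = raw.upper().replace(".", "")        # "S.A." becomes "SA"
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", cleaned)  # other punctuation becomes a space
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)

# Spellings of one owner as they might appear in different source databases.
variants = [
    "Royal Caribbean Cruises Ltd.",
    "ROYAL CARIBBEAN CRUISES, LTD",
    "royal caribbean cruises limited",
]
standardized = {standardize_name(v) for v in variants}
```

Once the variants collapse to a single key, records from different sources can be joined on that key. Rules like these need manual review, since dropping a suffix can occasionally merge two distinct companies.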
The final database was searchable and allowed the team to find patterns in the most important features of each ship, such as its age, previous flags and names, ownership information, technical history, incidents, arrests and the annual reviews to which the vessel was subjected.
The data also included details about the accidents involving each ship, such as their cause and the number of people missing or dead.
During more than four months of work, the students learned to use data processing tools such as OpenRefine, Tableau and the Python programming language. Above all, they learned how to take on a large-scale global research project and gained the experience of working alongside Univision’s team of professional journalists.
Meanwhile, in the Univision newsroom, journalists supplemented the research by analyzing other databases, such as the U.S. Coast Guard’s registry of crimes aboard cruise ships, the voluntary crime reports released by the companies themselves, the companies’ financial reports filed with the Securities and Exchange Commission (SEC), and the list of lobbying expenditures and donations that the main cruise companies direct to American politics.
The research project also traced the main cruise companies in the Panama Papers database, a leaked file of documents from the Panamanian law firm Mossack Fonseca compiled by the International Consortium of Investigative Journalists.
A team of reporters, graphic artists and video journalists from Univision News was in charge of analyzing and visualizing the data from Columbia University. They complemented the reporting and personal research with input from industry and legislative sources, as well as lawyers, cruise ship workers and passengers, all of whom tell their stories in this interactive article.
This is the first alliance forged by Univision News with an educational institution to produce a data journalism project. This is a new way of doing journalism in the digital age: finding facts of public interest and narrating stories that matter, through the analysis, contrast and contextualization of databases, using clear and attractive displays.
You can download the main database that we used here.
Sources: Global Integrated Shipping Information System (GISIS), IHS Sea-Web, Equasis, Lloyd’s Intelligence, US Coast Guard, Paris MoU, Tokyo MoU, USCG Port State Control, Viña del Mar Agreement, Mediterranean MoU, Indian Ocean MoU, Riyadh MoU, Caribbean Memorandum of Understanding, Abuja MoU, Black Sea MoU, International Association of Classification Societies (IACS), U.S. Securities and Exchange Commission, Cruise Line Incident Reporting Statistics, Carnival Voluntary Report of Alleged Crimes, Royal Caribbean Crime Allegation Statistics, Norwegian Cruise Line Voluntary Reporting Statistics, ICIJ-Panama Papers.