Database project
29 February 2016
ENS Cachan – 1AS2 2016
Grosshans Nathan
Project subject
1 Remark
The subject is taken, almost verbatim, from the one devised by Cristina Sirangelo for previous years.
Although the statement is in English, the reports and the final information system may be in French or in English,
as you prefer.
2 Statement
You are requested to design and develop an information system that manages the data of several airlines.
The system has to provide, as its main functionality, an on-line flight reservation service. It also has to serve
as the main data management tool for the airlines.
The design process of an information system always starts with a series of interviews where engineers meet
clients to understand what the requirements are for the system to be designed. The outcome of such interviews
is usually a set of informal specifications of the data that the system has to model and the functionalities that
it should provide. These interviews usually go through higher and higher levels of specification and often serve
as feedback for the initial design phases.
Assume that you (E) met the client (C) and that the following is the report of your interviews.
E. Good morning.
C. Good morning, please take a seat. . .
C. We would like to provide an on-line flight reservation service capable of handling uniformly data of
several airlines. We would like the system to be used on the one side by customers willing to book their
flight, and on the other side by each airline, to keep their data up-to-date.
E. Do you want each airline to be able to access only its own data?
C. Yes, although different airlines may share the same flights. In this case each airline must be able to
access the data of all the flights it shares.
E. I would need to ask you some more detailed questions about the nature of the data handled by the
airlines.
C. Sure, go ahead!
E. When you say “flight”, what do you mean?. . .
C. Ehm. . .
E. I’ll be more precise. Today I took the flight AF321 from Paris CDG to Rome FCO, but tomorrow, or
in 6 months, there will be an AF321 AirFrance flight from Paris CDG to Rome FCO as well. Are these
two different flights? What does the flight code mean then?
C. Oh, now I understand your question! A flight has a code (321 in your example) which is unique within
the airline (AirFrance in your example). The airline code (AF) followed by this unique code, gives you
what we call flight number. A flight with a given flight number may be scheduled to fly several times in
a week.
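As a small illustration of the naming rule just described, the sketch below splits a flight number into the airline code and the code that is unique within the airline. It assumes 2-character airline codes, as in the examples of the interview (AF, LH, KL), and purely numeric flight codes; the names `FlightNumber` and `parse_flight_number` are hypothetical and not part of the assignment.

```python
from typing import NamedTuple


class FlightNumber(NamedTuple):
    airline: str  # airline code, e.g. "AF"
    code: str     # code unique within the airline, e.g. "321"


def parse_flight_number(text: str) -> FlightNumber:
    """Split 'AF321' into ('AF', '321').

    Assumption: the airline code is always the first two characters
    and the rest of the string is made of digits only.
    """
    text = text.strip().upper()
    airline, code = text[:2], text[2:]
    if len(airline) != 2 or not code.isdigit():
        raise ValueError(f"not a valid flight number: {text!r}")
    return FlightNumber(airline, code)


print(parse_flight_number("AF321"))
```

Keeping the two parts separate makes it easy to state that the code is unique within one airline only.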
E. I see. . . so today’s and tomorrow’s flights are two instances of the same flight, the flight AF321. But
what can be changed between two different scheduled instances? The departure time I guess. . . what
about the departure and destination airports?
C. The departure and destination airports do not change with the scheduled instance: for instance all
scheduled instances of the flight AF321 fly from Paris CDG to Rome FCO.
E. . . . let me take a note, this is important, it is what we call functional dependency in database theory. . .
C. Functional what??
E. Never mind, just talking to myself. . . please continue.
C. On the other hand different scheduled instances of the same flight may have different departure times,
as well as different durations. The way all this changes depends on the airline timetable. . .
E. I will ask you later about the organization of the timetable, now I would like to understand more about
the flights.
C. As you prefer!
E. Is a flight with a given flight number always a direct one?
C. No. There are flights with stops. For instance the (Lufthansa) LH626 flight from Frankfurt to Doha
stops in Bahrain.
E. What does it mean that it stops? Do passengers have to get off and take another plane?
C. No. As a general rule, every instance of a single flight always uses a single aircraft. If the flight has
stops, passengers that have booked the whole flight just stay in the plane during the stop. But the flight
may be booked also partially. For instance, passengers may also book the LH626 flight from Frankfurt
to Bahrain, or from Bahrain to Doha. So the stop is also used to make some passengers get off and some
other passengers board.
E. I’ve never been on such a flight. . .
C. Well, indeed they are not many. . .
E. So the majority of flights have no stops. . . but can a flight have several stops?
C. In principle yes.
E. Can different instances of the same flight have different stops?
C. No. The segments the flight consists of, only depend on the flight number, not on the scheduled instance
of the flight.
E. Good. Now, how does flight sharing work?
C. A flight has a main operating airline, and its flight number within the operating airline identifies the
flight. But the flight can be shared by other airlines. In this case the same flight has different numbers,
one for each airline that shares it. As usual the flight number is unique within the airline. So for instance
the flight AF5467 is also operated by KLM (whose airline code is KL) where it has flight number 3456.
So AF5467 and KL3456 are the same flight.
E. Ok, things are getting clearer...so let’s move to timetables now. How do the airlines schedule their flights?
C. Usually a flight is scheduled to fly on some given days of the week (Monday, Wednesday and Thursday,
for instance), each day with a given departure and arrival time – as well as a given departure and arrival
time for each of the possible stops of the flight. This is what we call the timetable of the flight.
E. Do these departure and arrival times actually change every day of the week where the flight is scheduled?
C. Usually they do not, but the week may be partitioned into a few blocks. For instance, the LH626 flight
from Frankfurt to Doha right now is scheduled as follows: from Monday to Friday it departs at 13:25,
it arrives in Bahrain at 21:45, it leaves Bahrain at 22:30 and it arrives in Doha at 23:30. While on
Saturdays and Sundays it departs at 10:50, it arrives in Bahrain at 20:10, it leaves Bahrain at 21:00 and
arrives in Doha at 21:55.
E. Why do you say “right now”, do you mean that the timetable of a flight may be different in different
periods of the year?
C. Of course. For instance the above timetable of the flight LH626 was valid from Jan 12th to March
15th, 2009, but it has been replaced by another timetable from March 16th to Sep 14th, and in the rest
of 2009 the flight LH626 was not scheduled at all.
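One possible way to picture a timetable with a validity period and weekly blocks is sketched below. It encodes only the first LH626 timetable quoted in the interview (the times of the later timetable are not given, so they are left out). The names `TimetableBlock` and `departure_on` are hypothetical, and treating the end date as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date, time
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class TimetableBlock:
    valid_from: date
    valid_until: date      # assumed inclusive
    days: FrozenSet[int]   # ISO weekdays, 1=Monday ... 7=Sunday
    departure: time        # local departure time at the first airport


def departure_on(blocks: List[TimetableBlock], day: date) -> Optional[time]:
    """Return the departure time on `day`, or None if the flight is not scheduled."""
    for block in blocks:
        in_period = block.valid_from <= day <= block.valid_until
        if in_period and day.isoweekday() in block.days:
            return block.departure
    return None


# First LH626 timetable from the interview: 12 Jan to 15 Mar 2009.
LH626 = [
    TimetableBlock(date(2009, 1, 12), date(2009, 3, 15),
                   frozenset({1, 2, 3, 4, 5}), time(13, 25)),
    TimetableBlock(date(2009, 1, 12), date(2009, 3, 15),
                   frozenset({6, 7}), time(10, 50)),
]

print(departure_on(LH626, date(2009, 1, 12)))  # a Monday
print(departure_on(LH626, date(2009, 10, 1)))  # outside any validity period
```

A scheduled instance of the flight is then simply a date for which some block applies.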
E. Do you want to keep record also of the past timetables?
C. Yes, as well as of the past bookings; it can be useful for statistical analysis.
E. Can you tell me more about the booking service you want to provide?
C. The main services will be to compute one or more itineraries corresponding to the user request, and to
allow the user to purchase flight tickets corresponding to the chosen itinerary.
E. What is an itinerary?
C. It is a sequence of connected flights. Connected means that a flight in the sequence has to leave from
the same city where the preceding flight arrives.
E. You don’t require that the change takes place in the same airport?
C. No. But if the transfer requires an airport change, we want the customer to be warned.
Moreover, in general we would like the computation of the itinerary to be aware of the transfer time.
E. How?
C. The proposed itineraries should not have any transfer time of less than two hours. If there is an airport
change, then the minimum transfer time goes up to 3 hours.
E. I guess a stop of a flight is not considered as a flight transfer.
C. Right, it is not a flight transfer. So no check has to be performed on the stopping time when computing
the itinerary.
E. But how do I know that a change is actually a stop rather than a transfer?
C. It’s easy: in a stop both flights will have the same flight number.
E. Got it.
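The connection rules stated so far (same city, at least two hours for a transfer, three hours with an airport change, no check for a stop of the same flight) can be sketched as follows. This is only an illustration under assumptions: the `Leg` type and the function names are hypothetical, and all times are assumed to be already converted to a common reference such as UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Leg:
    flight_number: str
    from_airport: str
    from_city: str
    to_airport: str
    to_city: str
    departure: datetime  # assumed already converted to UTC
    arrival: datetime    # assumed already converted to UTC


def needs_airport_change(first: Leg, second: Leg) -> bool:
    """True when the customer must be warned about an airport change."""
    return first.to_airport != second.from_airport


def can_connect(first: Leg, second: Leg) -> bool:
    """Check whether `second` may follow `first` in an itinerary."""
    if first.to_city != second.from_city:
        return False  # legs are not connected at all
    if first.flight_number == second.flight_number:
        return True   # a stop of the same flight: no transfer-time check
    minimum = timedelta(hours=3 if needs_airport_change(first, second) else 2)
    return second.departure - first.arrival >= minimum
```

For example, two consecutive segments of LH626 connect regardless of the stopping time, while a change between two different flights in the same airport needs a gap of at least two hours.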
C. And you should also remember that all flight times are shown in the local time zone.
E. Yes. . . I guess that means I will have to maintain this time zone information somewhere to compute the
actual flight duration. . .
C. Sorry?
E. Er, nothing, I was just thinking out loud. . . By the way, how do you represent this time zone data?
C. Time zones are usually given as an offset with respect to Greenwich Mean Time.
E. Good, it’s pretty standard. Paris time zone is encoded as +0100: that’s one hour and zero minutes ahead of
GMT. Similarly, San Francisco is -0800 and New Delhi +0530.
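To make the engineer's remark concrete, here is a minimal sketch of how an actual flight duration could be computed from local times and offsets written as in the examples (+0100, -0800, +0530). The function names are hypothetical, the offsets are assumed to always carry an explicit sign, and the departure and arrival times used below are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(text: str) -> timezone:
    """Turn '+0530' or '-0800' into a fixed-offset timezone."""
    sign = -1 if text[0] == "-" else 1
    hours, minutes = int(text[1:3]), int(text[3:5])
    return timezone(sign * timedelta(hours=hours, minutes=minutes))


def flight_duration(dep_local: datetime, dep_offset: str,
                    arr_local: datetime, arr_offset: str) -> timedelta:
    """Duration between a local departure time and a local arrival time."""
    dep = dep_local.replace(tzinfo=parse_offset(dep_offset))
    arr = arr_local.replace(tzinfo=parse_offset(arr_offset))
    # Subtracting two aware datetimes compares them on a common (UTC) scale.
    return arr - dep


# Hypothetical flight: leaves Paris at 10:00, lands in New Delhi at 22:30.
print(flight_duration(datetime(2009, 1, 12, 10, 0), "+0100",
                      datetime(2009, 1, 12, 22, 30), "+0530"))
```

The naive difference of the displayed times would be 12 hours 30 minutes, whereas the offsets show that the flight lasts 8 hours.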
C. If you say so. . .
E. Never mind. . . Earlier you mentioned you wanted customers to actually purchase flights, I suppose this
means you have a pricing system?
C. Indeed. The base price of each flight is computed from the distance: 0.20 EUR/km for economy class
and 0.50 EUR/km in business class. Additionally, for the economy class, there are a number of discounts.
If the ticket is booked at least 28 days before the departure date you have a 60% discount. If you book
it between 27 and 14 days before departure it’s only 40% discount. Finally, between 13 and 7 days in
advance you still get 20% discount. Otherwise you pay the full base price.
E. I think I get the idea. And what about return trips? Do I have the same rate on both flights?
C. No, each discount is computed for each flight independently.
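The pricing rules given by the client can be sketched as a small function. The rates and discount thresholds come from the interview; the function names, the use of `Decimal`, and rounding to the cent are assumptions made for the illustration.

```python
from decimal import Decimal

# Base rates in EUR per km, as stated by the client.
RATE_PER_KM = {"economy": Decimal("0.20"), "business": Decimal("0.50")}


def economy_discount(days_before: int) -> Decimal:
    """Discount rate for an economy ticket booked `days_before` days ahead."""
    if days_before >= 28:
        return Decimal("0.60")
    if days_before >= 14:
        return Decimal("0.40")
    if days_before >= 7:
        return Decimal("0.20")
    return Decimal("0")


def ticket_price(distance_km: int, travel_class: str, days_before: int) -> Decimal:
    """Price of one flight; discounts apply to economy class only."""
    price = RATE_PER_KM[travel_class] * distance_km
    if travel_class == "economy":
        price *= 1 - economy_discount(days_before)
    return price.quantize(Decimal("0.01"))  # assumption: round to the cent


print(ticket_price(1000, "economy", 30))   # 80.00
print(ticket_price(1000, "business", 30))  # 500.00
```

Since each discount is computed for each flight independently, the price of a return trip or of a multi-flight itinerary would be the sum of `ticket_price` over its flights, each with its own number of days before departure.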
E. I would also need to know how many seats are dedicated to business class travelers.
C. True. 15% of the total number of seats of the aircraft are reserved for the business class.
E. Now, could you tell me a little bit about the overall user experience you were thinking of?
C. Sure, a typical user who wants to book a flight first enters the cities between which he wants to fly,
the dates when he wants to travel, whether he wants to book a return trip, and the class (economy or
business) he prefers. We only allow one ticket, for a single passenger, to be booked at a time. At this point,
the system should present the user with a list of possible itineraries, ordered by price or by total trip time
at the user’s choice. If the user chooses to purchase one of these flights, he will have to log in if he
already has an account, or be invited to register. He can then proceed to the actual booking.
Registered users, when they come back, should also have the option to see all the bookings they have
made with the system.
E. And will the airline companies access the same database?
C. Yes, it will be the same database but their accounts will have special permissions. They will be able to
change flight schedules, add new flights or cancel others. When a flight schedule is changed, the system
should check the itineraries that have been already booked which are affected by the change. If the flight
is cancelled, or the new schedule breaks the 2- or 3-hour minimum delay for a change, the system should
give the airline the list of all the concerned passengers.
E. And could you tell me a little bit more about the data the system will have to manage?
C. Sure. We manage the schedule of 17 companies which operate flights with 64 distinct types of aircraft
between more than 600 different airports all over the world. This results in about 30000 different
schedules.
E. And do you have an idea of the frequency of each of the different operations?
C. Roughly, we expect schedule changes to concern less than 1% of the flights every week. Itinerary
lookups and bookings are much, much more frequent operations. We expect to deal with several dozen
lookups per second, 5% of which will turn into actual bookings.
E. That many?
C. Well, you know, many flights are running each day, each with 150-200 seats and we do sell most of
them.
E. I see. . . Well, I think I have all the information I need to get me started. Thank you.
C. Thank you.
3 Dataset
A dataset, in the form of CSV files, is provided so that you can test your information
system. The content of each file is described below.
- Aircrafts (aircrafts.txt)
  - Name: Name of the aircraft.
  - ICAO: 4-letter ICAO code, if available.
  - IATA: 3-letter IATA code (identifier).
  - Capacity: Seating capacity (empty for cargo aircrafts).
  - Country: Country or territory where aircraft maker is incorporated.
- Airlines (airlines.txt)
  - Name: Name of the airline.
  - IATA: 2-letter IATA code (identifier).
  - ICAO: 3-letter ICAO code, if available.
  - Callsign: Airline callsign.
  - Country: Country or territory where airline is incorporated.
- Airports (airports.txt)
  - Name: Name of airport. May or may not contain the City name.
  - City: Main city served by airport. May be spelled differently from Name.
  - Country: Country or territory where airport is located.
  - IATA: 3-letter IATA code (identifier).
  - ICAO: 4-letter ICAO code.
  - Latitude: Decimal degrees, usually to six significant digits. Negative is South, positive is North.
  - Longitude: Decimal degrees, usually to six significant digits. Negative is West, positive is East.
  - Altitude: In feet.
  - Timezone: Hours offset from UTC.
  - DST: Daylight saving time. One of E (Europe), A (US/Canada), S (South America), O (Australia), Z (New Zealand), N (None) or U (Unknown).
- Distance (distance.txt)
  - Airport1: IATA 3-letter code for airport #1.
  - Airport2: IATA 3-letter code for airport #2.
  - Distance: Distance in kilometers between airport #1 and airport #2.
- Schedule (schedule.txt)
  - From: Departure airport (3-letter IATA code).
  - To: Arrival airport (3-letter IATA code).
  - Valid-From: Beginning of the schedule validity (2-Jan-2009 if not specified).
  - Valid-Until: End of the schedule validity (30-Jan-2009 if not specified).
  - Days: Days of operation (1=Monday, 2=Tuesday, 3=Wednesday, 4=Thursday, 5=Friday, 6=Saturday, 7=Sunday).
  - Departure: Departure time (hh:mm).
  - Arrival: Arrival time (hh:mm or hh:mm+N, +N=Arrival N days later).
  - Flight: Flight information (2-letter IATA airline code + flight number).
  - Aircraft: 3-letter IATA aircraft code.
  - Duration: Flight duration (hh:mm).
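Two of the Schedule fields need a little decoding before they can be loaded. The sketch below shows one way to do it; the helper names are hypothetical, and the encoding of Days as a run of digits such as "135" is an assumption that should be checked against the actual file. The `hh:mm+N` form of Arrival is the one described in the list above.

```python
from datetime import time
from typing import FrozenSet, Tuple


def parse_days(field: str) -> FrozenSet[int]:
    """Decode the Days field.

    Assumption: the field is a run of digits such as '135'
    (Monday, Wednesday, Friday), with 1=Monday ... 7=Sunday.
    """
    days = frozenset(int(ch) for ch in field.strip())
    if not days or not days <= frozenset(range(1, 8)):
        raise ValueError(f"invalid days of operation: {field!r}")
    return days


def parse_arrival(field: str) -> Tuple[time, int]:
    """Decode 'hh:mm' or 'hh:mm+N' into (arrival time, days later)."""
    clock, _, extra = field.strip().partition("+")
    hours, minutes = clock.split(":")
    return time(int(hours), int(minutes)), (int(extra) if extra else 0)


print(parse_days("135"))
print(parse_arrival("06:15+1"))
```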
4 Instructions
4.1 General
The project is to be carried out individually. Like a "real" information system design and implementation
project, it will proceed in 4 phases, by successive refinement, from the abstract to the concrete:
1. conceptual analysis;
2. logical design;
3. physical design;
4. implementation.
Each phase of the project will result in a deliverable in a given form on a given date; these details
are specified for each phase in the following subsections. Each deliverable will be graded, and the final
project grade will be based on the grades obtained for each phase.
I would like the material requested for each phase to be submitted on time; I will therefore apply a penalty of one point per day of lateness to the grade of any phase submitted late.
However, should you have any problem with the submission deadlines, do not hesitate to contact me so
that we can find a solution.
All material is to be sent to me by email, at [email protected]; I
will try to acknowledge receipt of the requested material (if I do not, feel free to send me a reminder).
4.2 Conceptual analysis
You must submit a report containing at least all of the following elements.
1. Your final conceptual (entity-relationship, ER) schema. Please follow, as far as possible, the graphical
formalism seen in class; however, I do not require any particular drawing tool or software suite, and a
scanned handwritten schema is just as acceptable, provided it is legible and complete. One tool that
seems suitable to me, for those who wish to use it, is DB-MAIN.
2. The documentation of the schema, as defined in the course.
3. The list of additional integrity constraints, as defined in the course.
4. An explanation and justification of the design choices that require one. In short, these are all the
choices you consider non-obvious (complex constructs, redundancies, etc.).
5. A description, in plain language, of the various functionalities of the information system. Based on
the conceptual schema, this means, concretely, describing the operations that the different users can
perform at any time on the overall population of entities and relationships.
Due no later than Friday 18/03/2016 at 23:59.
4.3 Logical design
You must submit a report containing at least all of the following elements.
1. Your restructured ER schema, as defined in the course.
2. An explanation and justification of the restructuring choices that require one. In short, these are all
the choices you consider non-obvious (choice of transformation, addition of primary identifiers, etc.).
3. Your final relational logical schema. Again, please follow, as far as possible, the graphical formalism
seen in class; as with the conceptual schema, I impose no other constraint, except that the schema
must be legible and complete.
4. The list of additional integrity constraints, this time expressed on the logical schema (bearing in mind
that some constraints may have been newly added as a result of translating the conceptual schema
into the logical schema).
Due no later than Friday 01/04/2016 at 23:59.
4.4 Physical design
You must submit a report containing at least all of the following elements.
1. The list of indexes to be defined on the database, with justifications. The justifications must take
into account the cost and frequency of the various queries that will be run against the database.
2. For each additional integrity constraint, a description of how it will be implemented. The possibilities
include triggers, stored procedures, application-level procedures, or a combination of these.
3. For each functionality of the information system, a description of how it will be implemented. This
covers both the operations that the different users can perform at any time and the management of
those users; all of this may be implemented through stored procedures, transactions, or a combination
of queries executed at the application level.
4. A description of the structure of the web application. This may, for example, take the form of a diagram
showing the different pages, their respective roles, and the links between them.
Due no later than Friday 13/05/2016 at 23:59.
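As a hedged illustration of what justifying an index can involve, the sketch below creates a toy table, defines an index on the column used by a frequent lookup, and asks the engine for its query plan. It uses SQLite through Python's standard `sqlite3` module only because it needs no setup; the DBMS used for the project may differ, and the table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy table: not the schema expected for the project.
    CREATE TABLE schedule (
        flight      TEXT NOT NULL,
        dep_airport TEXT NOT NULL,
        arr_airport TEXT NOT NULL,
        departure   TEXT NOT NULL
    );
    -- Index supporting lookups by departure airport.
    CREATE INDEX idx_schedule_dep ON schedule (dep_airport);
""")

# Ask SQLite how it would evaluate a lookup by departure airport.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT flight FROM schedule WHERE dep_airport = ?",
    ("CDG",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

Comparing such plans with and without the index, for the queries expected to be most frequent, is one way to back up the justification asked for in item 1.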
4.5 Implementation
You must submit an archive containing at least all of the following elements.
1. The SQL code that creates the database.
2. The SQL code implementing the additional integrity constraints, as well as the various functionalities of the information system.
3. The scripts and SQL code that insert the test data into the database.
4. The code of the web application.
5. A short README file explaining how to use the web application. This includes, in particular, a link
to a "production" version of your information system (you may host it on the tpbdd server
that you use for the database lab sessions) and all the information needed to log in to it,
for each type of user.
Due no later than Friday 20/05/2016 at 23:59.
4.6 Final presentation
This will be a short, informal, individual presentation of 10–15 minutes, allowing me to see your
information system in action and possibly to ask you a few questions.
Scheduled for Friday 27/05/2016.