Open issues regarding legal metadata: IP licensing and management of different cognitive levels
FLORENCE
MAY 6th, 2011
Danièle Bourcier
Meritxell Fernández-Barrera
Cersa CNRS-Université Paris 2, Paris
State of the art and current trends in access to legal information
Legal metadata: the concept
• Metadata: data about data
• Legal metadata: data about legal data, which enable access to that data
• Typology:
  - Legal indexes
  - Legal thesauri
  - Lexical or lightweight legal ontologies
  - Formal legal ontologies
Legal metadata: developments

From the 1990s up to now:
• Development of core and domain legal ontologies
• Formal and lexical ontologies
• Interoperability: mapping and alignment
• Methodologies: manual, bottom-up, middle-out
• Applications: semantic information retrieval, reasoning, cross-lingual information retrieval, legal drafting, …
New ways of consuming legal information
Linked legal data
Internet of legal services (Apps)
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
New challenges…
Addressing a wide typology of users
I was fooled by a seller
I want my money back
Consumer justice
The seller did not behave correctly
Is the seller in breach of contract?
concepts
IP and legal metadata
Legal metadata
Annotated legal data
Raw legal data
LRI-Core Graph (Leibniz Center for Law)
IP regime (Copyright, CC, public domain) determines:
Who can reuse?
What type of reuse? (Commercial/non-commercial; derivative work)
Towards a repository of annotated legal metadata?
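
As a rough illustration of how an IP regime answers the two questions above, the following Python sketch encodes a few regimes as permission flags. The names `Regime`, `REGIMES` and `can_reuse` are hypothetical and not part of the presentation, and the table deliberately simplifies the actual licence terms (attribution duties, database rights and national exceptions are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    """Simplified reuse permissions attached to a legal metadata resource."""
    reuse: bool        # is any reuse allowed without asking the right holder?
    commercial: bool   # is commercial reuse allowed?
    derivatives: bool  # are derivative works allowed?
    share_alike: bool  # must derivatives be released under the same terms?


# Illustrative table only (assumption): real licence texts carry more conditions.
REGIMES = {
    "public domain": Regime(True, True, True, False),
    "CC BY": Regime(True, True, True, False),
    "CC BY-SA": Regime(True, True, True, True),
    "CC BY-NC": Regime(True, False, True, False),
    "CC BY-ND": Regime(True, True, False, False),
    "CC BY-NC-SA": Regime(True, False, True, True),
    "CC BY-NC-ND": Regime(True, False, False, False),
    "copyright (all rights reserved)": Regime(False, False, False, False),
}


def can_reuse(regime_name, commercial=False, derivative=False):
    """Return True if the requested kind of reuse is allowed under the regime."""
    regime = REGIMES[regime_name]
    if not regime.reuse:
        return False
    if commercial and not regime.commercial:
        return False
    if derivative and not regime.derivatives:
        return False
    return True
```

For example, `can_reuse("CC BY-NC", commercial=True)` is false, while a public institution reusing the same resource non-commercially would be allowed. A repository of annotated legal metadata would presumably need to record such a regime for each resource.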
Our experience…
Legilocal project

• French research project: private and public partners (2 research centres; 4 private firms)
• Goal: enabling online access to legal information produced by local institutions in France (departments, regions, municipalities)
  - 101 departments
  - 27 regions
  - 36,682 municipalities
• Currently:
  - Lack of standards regarding both file formats and the types of local legal sources made available online
  - Difficult access for citizens to pieces of local regulation
Legilocal architecture
Widgets in local websites
Local databases
Database containing XML-annotated documents
Metasearch engine
Social network
IP issues: working out a suitable business model
• IP over (commercial reuse? share alike? derivative work?):
  - Database of local public data
  - Manually crafted legal metadata (OWL ontologies)
  - Legal metadata produced with the aid of NLP tools (terminologies)
  - XML annotations of documents
• Balance between different stakeholders in Legilocal: public administration, research centres, private firms
Legilocal lexico-semantic resource

Different cognitive levels:
• Legislative
• Legal professionals
• Citizens


Methodology
Case study: noise regulation

Construction of 3 databases with MySQL and PHP:
• Regulation: national + local
• Case law: Conseil d’État + Cour de Cassation
• Citizens’ complaints (from CIDB and Prefecture de Paris) + interviews with public officers dealing with noise cases
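
The slides do not give the structure of these databases. The sketch below is only one plausible layout, written in Python with the standard `sqlite3` module instead of MySQL and PHP so that it runs without a server. For brevity it models the three databases as three tables in one file; all table and column names, and the helpers `build_corpus_db` and `count_rows`, are assumptions.

```python
import sqlite3

# Hypothetical schema: one table per corpus named in the case study.
SCHEMA = """
CREATE TABLE regulation (
    id    INTEGER PRIMARY KEY,
    level TEXT NOT NULL CHECK (level IN ('national', 'local')),
    title TEXT NOT NULL,
    body  TEXT NOT NULL
);
CREATE TABLE case_law (
    id    INTEGER PRIMARY KEY,
    court TEXT NOT NULL
          CHECK (court IN ('Conseil d''État', 'Cour de Cassation')),
    body  TEXT NOT NULL
);
CREATE TABLE complaint (
    id     INTEGER PRIMARY KEY,
    origin TEXT NOT NULL
           CHECK (origin IN ('CIDB', 'Prefecture de Paris', 'interview')),
    body   TEXT NOT NULL
);
"""

TABLES = ("regulation", "case_law", "complaint")


def build_corpus_db(path=":memory:"):
    """Create the three corpus tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def count_rows(conn):
    """Return the number of stored documents per corpus table."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in TABLES
    }
```

The `CHECK` constraints simply restate the sources listed above, so a row with an unknown court or origin is rejected at insertion time.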
Challenges of user-generated texts
• Mostly handwritten → transcription
• Lack of terminology harmonisation
"[…] a very annoying problem: we can hear the neighbours. Admittedly, this is mostly when everything is quiet, in the evening, but it is: the sound of voices, of crying, of water running in the sink, of impacts, of footsteps on the tiled floor, on the stairs, upstairs.
During the day we do not mind at all. But at night it is impossible to sleep. With this perpetual humming, which is loud (earplugs, double glazing, nothing helps), what are our rights, in your opinion? It is said that we are not in the town centre."
Noise producer
Noise intensity
Noise source
Location
Noise duration
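
To make the annotation categories above concrete, here is a minimal Python sketch that tags a complaint with them by simple cue matching. The slides do not say how the annotation was actually produced or which text span received which label, so the cue lexicon `CUES`, the normalised values and the function `annotate` are all assumptions made for illustration.

```python
import re

# Hypothetical cue lexicon: annotation category -> {cue phrase: normalised value}.
CUES = {
    "noise_producer": {"neighbours": "neighbour"},
    "noise_intensity": {"loud": "loud"},
    "noise_source": {
        "voices": "voices",
        "footsteps": "footsteps",
        "crying": "crying",
    },
    "location": {"stairs": "stairs", "town centre": "town centre"},
    "noise_duration": {"perpetual": "continuous"},
}


def annotate(text):
    """Return, per category, the normalised values whose cue occurs in text."""
    lowered = text.lower()
    annotations = {category: [] for category in CUES}
    for category, cues in CUES.items():
        for cue, value in cues.items():
            # Whole-word match, so that "stairs" does not fire on "upstairs".
            found = re.search(r"\b" + re.escape(cue) + r"\b", lowered)
            if found and value not in annotations[category]:
                annotations[category].append(value)
    return annotations


sample = (
    "We can hear the neighbours: voices and footsteps on the stairs. "
    "With this perpetual humming, which is loud, it is impossible to sleep."
)
result = annotate(sample)
```

A fixed lexicon of this kind breaks down quickly on real complaints, which is precisely the lack of terminology harmonisation noted above: the same noise source can be worded in many ways.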
Conclusions and further work
• New ways of consuming legal information (apps, Web 2.0) introduce new requirements regarding the production and reuse of legal metadata
• Through the Legilocal project we are exploring IP issues and the distance between user-generated content and legal expert knowledge
• IP models should ensure free reusability by public institutions
• Term extraction from user-generated corpora
• Semi-automatic mapping between user-generated content and legal expert content
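
The last two items are listed as further work, and the slides do not describe a method for them. As a sketch only, term extraction can be approximated by frequency counting, and semi-automatic mapping by proposing string-similar expert terms for a human to validate. The functions `extract_terms` and `suggest_mappings`, the stopword list and the similarity cutoff are assumptions; string similarity is a stand-in here and will miss lay terms that share no wording with the expert vocabulary.

```python
import difflib
import re
from collections import Counter


def extract_terms(texts, stopwords, top_n=10):
    """Return the top_n most frequent non-stopword tokens in the corpus."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[^\W\d_]+", text.lower()):
            if token not in stopwords and len(token) > 2:
                counts[token] += 1
    return [term for term, _ in counts.most_common(top_n)]


def suggest_mappings(user_terms, expert_terms, cutoff=0.6):
    """Propose candidate expert terms for each user term.

    The output is a list of suggestions per term, to be accepted or
    rejected by a person: this is the "semi-automatic" part.
    """
    return {
        term: difflib.get_close_matches(term, expert_terms, n=3, cutoff=cutoff)
        for term in user_terms
    }


complaints = [
    "The noise from the neighbours is loud at night",
    "Noise of footsteps every night",
]
stopwords = {"the", "from", "is", "at", "of", "every"}
terms = extract_terms(complaints, stopwords, top_n=2)
suggestions = suggest_mappings(terms, ["noise", "contract"])
```

An empty suggestion list for a term signals that the expert vocabulary has no close wording for it, which is where manual mapping would take over.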