EntityExtraction

Team project (Group 5) for extracting entities and their links

To run our project, please use the following commands:

Go to the source directory:

	 cd FinalProject/

To test with a small sample of the data (`take(10)`), use the command below:
```
 ./runnable_take10.sh
```

Run the whole application with the Stanford stack. This script runs the application on the YARN cluster and uses `StandfordNER.jar` to find entities, which is slower but more accurate than NLTK chunking:

	./run_standford.sh

Or run the application with the NLTK module (similar functionality to the Stanford stack):

	./run_nltk.sh

*** NOTE ***:

By default, the run scripts use the WARC record ID and HDFS input file specified by the TA.
The output is written to the default folder in HDFS. Check the log printed by the run script for the output directory, which is named after the current date.
To change the WARC record ID, the HDFS input file, and the output directory, use the command below:
```
  ./run_standford.sh "WARC_Record_Id" "Hdfs file path" "Output Directory"
```
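
The defaulting behaviour described in the note can be sketched as below. This is a minimal illustration, not the contents of `run_standford.sh`: the function name `resolve_args` and all three default values are hypothetical placeholders, and the real script may name its output directory differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of how a run script could fall back to defaults.
# None of the default values below come from the actual project.

resolve_args() {
  # Each positional argument falls back to a placeholder when omitted.
  local record_id="${1:-WARC-Record-ID}"              # placeholder default
  local input_path="${2:-/path/to/input.warc.gz}"     # placeholder default
  local output_dir="${3:-output_$(date +%Y-%m-%d)}"   # named after the current date
  # Print the resolved values, one per line.
  printf '%s\n%s\n%s\n' "$record_id" "$input_path" "$output_dir"
}

resolve_args "$@"
```

Called with no arguments, the sketch prints the three placeholder defaults; called with three arguments, it prints those arguments unchanged.
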
To view the output, use the commands below:

```
  hdfs dfs -ls "Output Directory"
  hdfs dfs -cat "File in output directory"
```

Thanks - Regards, Team 5
