About this project
Open
*Update: attached the file "technicaldescription_and_pdfsamples", which contains PDF samples and a more detailed specification.
The goal is to create a software tool or script that interprets a complex PDF file and converts it into a structured text file (JSON). The script/software must be written in English, and I must have complete access to the source files. Any programming language can be used (Python, Java, C/C++, etc.).
There are several different types of PDF files, and the software should be able to break the important information in each PDF file into a structured JSON file. So the problem is to segment the information in the PDF files.
The PDF files are tests (math, science, informatics, etc.) in Portuguese. The idea is to separate the important information so we can build a database from it. The scope of this project is only to extract the information from the PDF file and present it as JSON output.
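As a rough illustration only (the real fields and rules are defined in the attached specification), a minimal Python sketch of the segmentation step might look like the following. It assumes the text has already been extracted from the PDF, that each question starts with a line like "Questão 1", and that alternatives look like "a) ...". These patterns, the function name segment_test, and the JSON field names are all hypothetical.

```python
import json
import re

# Hypothetical pattern: assumes each question starts with "Questão <number>"
# on its own line. Real test layouts vary and would need their own rules.
QUESTION_START = re.compile(r"^Questão[ \t]+(\d+)[ \t]*$", re.MULTILINE)
# Hypothetical pattern: assumes alternatives are written as "a) text".
ALTERNATIVE = re.compile(r"^([a-e])\)[ \t]*(.+)$", re.MULTILINE)


def segment_test(text):
    """Split already-extracted text into a list of question records."""
    questions = []
    starts = list(QUESTION_START.finditer(text))
    for i, match in enumerate(starts):
        # A question runs until the next question header or the end of text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        body = text[match.end():end].strip()
        alternatives = [
            {"label": m.group(1), "text": m.group(2).strip()}
            for m in ALTERNATIVE.finditer(body)
        ]
        # The statement is everything before the first alternative, if any.
        first_alt = ALTERNATIVE.search(body)
        statement = body[:first_alt.start()].strip() if first_alt else body
        questions.append({
            "number": int(match.group(1)),
            "statement": statement,
            "alternatives": alternatives,
            "images": [],  # placeholder: image extraction is not shown here
        })
    return questions


sample = """Questão 1
Quanto é 2 + 2?
a) 3
b) 4
Questão 2
Qual é a capital do Brasil?
a) Brasília
b) Lisboa
"""

result = segment_test(sample)
print(json.dumps({"questions": result}, ensure_ascii=False, indent=2))
```

Text extraction from the PDF and storage of figures are left out here on purpose; they depend on the library chosen and on how different the files turn out to be.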
The complexity is that the files are (very) different from one another, and many have figures and images that must be stored as well. I have an initial idea for how to approach the problem, but I'm open to discussing the problems and possible solutions as well. A detailed specification can be sent if you are interested.
I'm available on Skype to explain everything.
The only skills needed are good programming and problem-solving skills.
I'm very keen to help and discuss alternatives.
The job is not easy, and price and time can be negotiated. I believe that good performance must be well rewarded ($$).
Thanks,
Category: IT & Programming
Subcategory: Desktop apps
Is this a project or a position? I don’t know yet
I currently have: Specifications
Required availability: As needed
Experience in this type of project: Yes (I have managed this kind of project before)
Required platforms: Windows
Delivery term: Not specified
Skills needed