Project Name: Web Application Firewall using Machine Learning
Phase 1: Designing and Building Web Proxy
Objective: The proxy will act as a gateway that forwards traffic or drops insecure traffic, either automatically or manually. In the initial phases, the proxy should hold a list of all secure links/pages of the target portal. A list of malicious inputs/traffic will then be stored in the database so that Machine Learning can be applied in later phases.
The proxy should be able to list all request and response details such as IP, Host, payload, parameters, etc., in case the administrators would like to investigate an HTTP request further.
Details: The proxy will contain two main phases:
o Auto Learning Phase: The aim of this phase is to crawl/spider the target portal automatically and understand its structure. The crawling process will list all of the portal's URLs and folders. The proxy does this automatically by deep scanning the target. The output should look like this:
https://target.com/admin/login.php
https://target.com/admin/resetpassword.php
https://target.com/home/
https://target.com/news?id=1
All captured and generated data should be stored in the database.
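The link-discovery step of the crawl can be sketched as follows. This is a minimal illustration using only the Python standard library, not a full crawler: the names `LinkExtractor` and `same_site_links` are hypothetical, and the sketch assumes that only `href` and `action` attributes are of interest and that "same portal" means "same host".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag


class LinkExtractor(HTMLParser):
    """Collect link and form targets found in one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Assumption: href (links) and action (forms) are enough for a sketch.
            if name in ("href", "action") and value:
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def same_site_links(base_url, html):
    """Return the sorted absolute URLs that stay on the host of base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    return sorted(u for u in parser.links if urlparse(u).netloc == host)
```

A real crawl would fetch each discovered URL in turn and repeat this step until no new links appear.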
The following should be stored for each page:
- Link (e.g.: …/News.php?id=1, …/forum.php, etc.)
- List of inputs of the page that appear in the URL or in the page itself.
- Minimum byte: the minimum byte value within the inputs.
- Maximum byte: the maximum byte value within the inputs.
- Mean byte: the mean byte value of the input characters.
- Standard deviation byte: the standard deviation of the byte values within the input.
- Previous page: records the page sequence, i.e. which page or link preceded the current one.
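The four byte features listed above could be computed roughly as shown here. The function name `byte_features` is hypothetical, and the sketch makes two assumptions the brief does not state: inputs are encoded as UTF-8, and the standard deviation is the population standard deviation.

```python
import statistics


def byte_features(value):
    """Return min, max, mean and standard deviation of the input's bytes."""
    data = value.encode("utf-8")  # assumption: UTF-8 encoding
    if not data:
        # Assumption: an empty input yields all-zero features.
        return {"min": 0, "max": 0, "mean": 0.0, "std": 0.0}
    return {
        "min": min(data),
        "max": max(data),
        "mean": statistics.fmean(data),
        "std": statistics.pstdev(data),  # population standard deviation
    }
```

For example, `byte_features("abc")` works on the byte values 97, 98 and 99, so the minimum is 97, the maximum is 99 and the mean is 98.0.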
The database may contain duplicate entries for each page; this is fine, as they will be cleaned in later phases.
However, the system should keep a separate table that stores each page of the portal only once.
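One possible storage layout is sketched below with SQLite. The brief does not name a database engine, so SQLite, the table names `page_visits` and `unique_pages`, the column names, and the helper functions are all assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: one table that allows duplicates, one that does not.
SCHEMA = """
CREATE TABLE IF NOT EXISTS page_visits (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    link          TEXT NOT NULL,
    inputs        TEXT,
    min_byte      INTEGER,
    max_byte      INTEGER,
    mean_byte     REAL,
    std_byte      REAL,
    previous_page TEXT
);
CREATE TABLE IF NOT EXISTS unique_pages (
    link TEXT PRIMARY KEY
);
"""


def open_db(path=":memory:"):
    """Open the database and create both tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_visit(conn, link, inputs="", previous_page=None,
                 stats=(None, None, None, None)):
    """Store one observation; duplicates are kept in page_visits only."""
    conn.execute(
        "INSERT INTO page_visits "
        "(link, inputs, min_byte, max_byte, mean_byte, std_byte, previous_page) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (link, inputs, *stats, previous_page),
    )
    # The primary key on unique_pages.link silently drops repeated links.
    conn.execute("INSERT OR IGNORE INTO unique_pages (link) VALUES (?)", (link,))
    conn.commit()
```

Recording the same link twice leaves two rows in `page_visits` and one row in `unique_pages`.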
o Manual/Monitoring Learning Phase: The second way to build our database is to let the owner of the portal or their customers access the portal and manually navigate through all of the website's content. This access will include testing all portal inputs and fields. Thus, the client will be asked to fill in forms, submit comments, send applications, upload files, and test all portal components as any normal user would.
In this phase, the proxy should clearly identify all the portal inputs, whether in the URL or in the page request. The proxy should also learn which inputs are normal and acceptable from remote users:
1. Is the input parameter an integer?
2. Is the input parameter a string?
3. Is the input parameter a mix of integers and strings?
The application should identify all the inputs that users normally fill in. Inputs that are almost static and that users should not modify are stored and recorded.
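The three input-type questions above can be answered with a small classifier like the one sketched here. The function name `classify_input` and the exact rules are assumptions: a value counts as an integer if it is an optionally signed run of digits, as mixed if it contains at least one digit alongside other characters, and as a string otherwise.

```python
import re


def classify_input(value):
    """Label an observed parameter value as integer, string, or mixed."""
    if re.fullmatch(r"[+-]?\d+", value):
        return "integer"
    if any(ch.isdigit() for ch in value):
        return "mixed"
    return "string"
```

Under these rules `classify_input("42")` gives `"integer"`, `classify_input("hello")` gives `"string"`, and `classify_input("abc123")` gives `"mixed"`.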
Phase 2: Building Database with un-secure inputs:
Objective: This phase fills the database manually with a list of insecure inputs such as XSS, SQL injection, and command injection. The data was mostly collected from various GitHub repositories.
Proxy Dashboard
The Dashboard should contain the following:
1. Adding Target portal: This window should contain, at least, the following:
a. Target Name
b. Target URL or IP address
c. Target port
d. Hosting environment Operating System.
This feature should ask the user whether they want to start the automated scanning (crawling) of the target portal.
2. Logs Window: This window displays all connection requests initiated from clients toward the target portal. It should display the logs in organized rows, and each row should be colored Green, Orange, or Red:
a. Green: The request is classified as safe.
b. Red: The request is classified as Malicious.
c. Orange: New request, not classified yet.
The admin can click on each row to display more details, such as:
d. Header details
e. Client IP address
f. Host Details
g. Request and response details
h. Timeframe
i. Classification of the request (Malicious | safe)
The admin user can manually perform the following actions on each row:
a. Change the classification of the request to Malicious or Safe.
b. Send the request to the “ToDo” list for further investigation.
c. Get statistics for the requested URL.
d. Display a graph that shows the page and its details (time, requests per day, number of sessions, etc.).
3. Security Logs: This section lists only the security alarms and malicious requests. The admin will spend more time here to get the whole picture of how the portal is being accessed. Requests with malicious behavior (with over 99% accuracy) will be blocked automatically and the database will be updated accordingly.
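The row coloring and the automatic blocking rule could be expressed as in the sketch below. All names here are hypothetical, and the sketch assumes the classifier returns a label plus a score between 0 and 1, with the 99% figure applied to that score.

```python
# Hypothetical mapping from classification label to log row color.
COLOURS = {"safe": "green", "malicious": "red", "unclassified": "orange"}


def row_colour(label):
    """Return the row color; unknown labels are treated as not yet classified."""
    return COLOURS.get(label, "orange")


def should_auto_block(label, score, threshold=0.99):
    """Block only malicious requests whose score is strictly over the threshold."""
    return label == "malicious" and score > threshold
```

With these rules a malicious request scored 0.995 is blocked automatically, while one scored exactly 0.99 is only shown in red for the admin to review.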
Delivery term: Not specified