With technological advancements, the responsibility to provide users with safe browsing and internet services has increased. Every time we surf the internet, we risk coming across malicious links and websites that might corrupt our systems or exploit our personal data. To protect users from such unsafe sites and links, Google Cloud offers a service called “Web Risk”.
Let’s see how Web Risk helps provide a secure and protected browsing experience.
Web Risk
Web Risk is a Google Cloud service that lets client applications check URLs against Google’s lists of unsafe web resources. These lists are constantly updated.
Examples of unsafe web resources include social engineering sites, such as deceptive and phishing sites, and sites that host malware or unwanted software. Web Risk allows you to quickly detect known hazardous sites, warn visitors before they click on infected links, and block users from publishing links to known infected pages on your site. Web Risk contains information on over a million dangerous URLs and is kept up to date by scanning billions of URLs every day.
In short, Web Risk helps you stop people from posting or sharing malicious links on your site and lets you show a warning before anyone visits an unsafe page.
Web Risk is a flexible API-based service that can be easily integrated into your apps.
Features of Web Risk
Web Risk has several features. Let’s look at them one by one to understand the service better.
Detailed List of Known Unsafe URLs
Use Google's continuously updated listings of dangerous web resources to discover phishing and fraudulent websites, as well as sites that host malware or unwanted software.
Application Agnostic
Web Risk can be used with cloud-hosted applications and websites as well as on-premises hosted applications and websites.
Lookup API
The Lookup API lets your client applications send URLs to the Web Risk servers to check whether they appear on any of the known unsafe lists. If a URL is found on a list, it is flagged and returned in the response.
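To make the Lookup API more concrete, here is a minimal sketch of how a client could build a lookup request. It assumes the public REST endpoint https://webrisk.googleapis.com/v1/uris:search with an API key passed as the key parameter; the helper names build_lookup_url and is_flagged are our own and not part of any Google library. The sketch only builds the request URL and interprets a response, so it does not touch the network.

```python
from urllib.parse import urlencode

# Assumed REST endpoint of the Lookup API (v1).
LOOKUP_ENDPOINT = "https://webrisk.googleapis.com/v1/uris:search"

# Threat lists a client can ask about.
DEFAULT_THREAT_TYPES = ("MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE")


def build_lookup_url(uri, api_key, threat_types=DEFAULT_THREAT_TYPES):
    """Return the GET URL that asks Web Risk about a single URI.

    api_key is a placeholder for your own credential.
    """
    # threatTypes is a repeated query parameter, one entry per list.
    params = [("threatTypes", t) for t in threat_types]
    params.append(("uri", uri))
    params.append(("key", api_key))
    return LOOKUP_ENDPOINT + "?" + urlencode(params)


def is_flagged(response_json):
    """Interpret a decoded JSON response.

    We assume a match carries a "threat" object and a clean URL
    yields an empty JSON object.
    """
    return bool(response_json.get("threat"))


if __name__ == "__main__":
    print(build_lookup_url("http://example.com/", "YOUR_API_KEY"))
```

You would send the resulting URL with any HTTP client and pass the decoded JSON body to is_flagged.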
Update API
The Update API lets you download hashed versions of the unsafe lists and store them in a local database. URLs are checked against this local database first. If a match is found, the client can send a verification request to the Web Risk servers to confirm that the URL is on the unsafe lists.
Detect Malicious URLs
You can use either the Lookup API or the Update API to detect malicious URLs. Both provide the same information, so which one should you use?
The Lookup API is the easiest to use. You simply query Web Risk for every URL you wish to check.
The Update API is more complex to set up, but it has several appealing features.
With the Update API, you maintain a local database of hashed unsafe-list entries.
You check URLs against this database first to determine whether they might be malicious.
This database behaves like a Bloom filter. It can produce false positives (a URL that is not malicious appears to be malicious), but it should never produce false negatives (a URL that is malicious goes undetected).
As a result, the Web Risk servers are contacted only infrequently, to confirm matches and rule out false positives. In most cases, checking a URL with the Update API does not require contacting the Web Risk servers at all.
You only need to contact the Web Risk servers when updating the local database or when confirming that a URL is unsafe.
So, if you want a quick setup and easy results, use the Lookup API; if you need lower latency, use the Update API.
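The Update API flow described above can be sketched roughly as follows. This is a simplified illustration, not Google's client implementation: it assumes the local database is a plain set of 4-byte SHA256 prefixes, and the confirm callable is a hypothetical stand-in for the server request that returns the full hashes behind a prefix.

```python
import hashlib
from typing import Callable, Iterable, Set

PREFIX_LENGTH = 4  # assumed fixed prefix size for this sketch


def check_expressions(
    expressions: Iterable[str],
    local_prefixes: Set[bytes],
    confirm: Callable[[bytes], Set[bytes]],
) -> bool:
    """Return True if any expression is confirmed to be on an unsafe list.

    expressions: the suffix/prefix expressions of an already
        canonicalized URL.
    local_prefixes: the locally stored hash prefixes.
    confirm: hypothetical function that asks the server for the
        full hashes matching a prefix.
    """
    for expression in expressions:
        full_hash = hashlib.sha256(expression.encode("utf-8")).digest()
        prefix = full_hash[:PREFIX_LENGTH]
        if prefix not in local_prefixes:
            # No local match means no server call: no false negatives.
            continue
        # A local match may be a false positive, so confirm it.
        if full_hash in confirm(prefix):
            return True
    return False
```

Because the server is consulted only on a local prefix match, most checks finish without any network round trip, which is where the lower latency comes from.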
Access Control with IAM
IAM stands for Identity and Access Management. Google Cloud's IAM lets you grant access to specific GCP resources and prevent unwanted access to other resources.
Access control for the Web Risk API can also be enforced with VPC Service Controls. Administrators can use VPC Service Controls to build a service perimeter around the resources of Google-managed services in order to limit communication to and between those services.
Web Risk does not require any additional permissions to call its APIs. This means that service accounts with no IAM roles can be used.
Hashing URLs
The Web Risk lists consist of variable-length SHA256 hashes, that is, hash prefixes of varying lengths. Before checking a URL (Uniform Resource Locator) against a Web Risk list, either locally or on the server, clients must first compute the hash prefix of that URL.
Follow these steps to compute a URL's hash prefix:
Canonicalize the URL according to the instructions in Canonicalization.
Create the URL's suffix/prefix expressions as stated in Suffix/Prefix expressions.
As discussed in Hash calculations, compute the full-length hash for each suffix/prefix expression.
As detailed in Hash prefix computations, compute the hash prefix for each full-length hash.
Note that these steps mirror the process the Web Risk server uses to maintain the Web Risk lists.
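The last two steps (full-length hash, then hash prefix) can be illustrated with a short sketch. It assumes the expression has already been canonicalized and that a prefix is the leading 4 to 32 bytes of the SHA256 digest; the function names are ours, chosen only for this example.

```python
import hashlib


def full_hash(expression: str) -> bytes:
    """Full-length (32-byte) SHA256 hash of one suffix/prefix expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()


def hash_prefix(expression: str, length: int = 4) -> bytes:
    """Leading `length` bytes of the full hash.

    The 4..32 byte range is an assumption of this sketch.
    """
    if not 4 <= length <= 32:
        raise ValueError("prefix length must be between 4 and 32 bytes")
    return full_hash(expression)[:length]


if __name__ == "__main__":
    # "abc" is just a well-known SHA256 test input, not a real expression.
    print(hash_prefix("abc").hex())  # ba7816bf
```

A 4-byte prefix keeps the local database small, at the cost of occasional collisions that have to be confirmed against the full hash.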
Canonicalization
Before canonicalization begins, we assume the client has parsed the URL and made it valid according to RFC 2396. If the URL contains an internationalized domain name (IDN), the client should convert it to its ASCII Punycode representation. The URL must include a path component; that is, it must have a slash after the domain (http://google.com/ rather than http://google.com).
First, remove tab (0x09), CR (0x0d), and LF (0x0a) characters from the URL. Do not remove escape sequences for these characters, such as %0a.
Second, if the URL ends in a fragment, remove the fragment. For example, shorten http://google.com/#frag to http://google.com/.
Third, repeatedly percent-unescape the URL until it has no more percent-escapes.
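Here is a small sketch of these three preliminary steps. It reads the third step as repeated percent-decoding until the URL stops changing, and it leaves escaped control characters such as %0a to that decoding step rather than stripping them up front. The function name is hypothetical, and the result is returned as bytes because decoded escapes are not always valid UTF-8.

```python
from urllib.parse import unquote_to_bytes


def strip_and_unescape(url: str) -> bytes:
    """Apply the three preliminary canonicalization steps to a URL."""
    # Step 1: drop raw tab, CR, and LF characters only.
    for control in ("\t", "\r", "\n"):
        url = url.replace(control, "")

    # Step 2: drop the fragment, if there is one.
    url = url.split("#", 1)[0]

    # Step 3: percent-decode repeatedly until nothing changes.
    data = url.encode("utf-8")
    while True:
        decoded = unquote_to_bytes(data)
        if decoded == data:
            return data
        data = decoded


if __name__ == "__main__":
    print(strip_and_unescape("http://goo\tgle.com/#frag"))  # b'http://google.com/'
```

The loop always terminates, because each pass either shortens the data or leaves it unchanged.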
Hostname Canonicalization
Take the hostname from the URL and then do the following:
Remove all leading and trailing dots.
Replace consecutive dots with a single dot.
If the hostname can be parsed as an IP address, normalize it to four dot-separated decimal values. The client should handle any legal IP address encoding, including octal, hex, and fewer than four components.
Lowercase the whole string.
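The hostname rules can be sketched like this. The IP parsing is hand-written for the example and follows the classic convention where the last component fills all remaining bytes (so 0x7f.1 means 127.0.0.1); it covers IPv4 only, and the helper names are our own.

```python
import re

# One IPv4 component: hex with 0x prefix, or a run of digits.
_IP_PART = re.compile(r"(0[xX][0-9a-fA-F]+|[0-9]+)\Z")


def _as_ipv4(host):
    """Return the dotted-decimal form of host, or None if it is not an IP."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    values = []
    for part in parts:
        if not _IP_PART.match(part):
            return None
        try:
            if part[:2].lower() == "0x":
                values.append(int(part, 16))
            elif len(part) > 1 and part[0] == "0":
                values.append(int(part, 8))  # leading zero means octal
            else:
                values.append(int(part, 10))
        except ValueError:  # e.g. "09" is not valid octal
            return None
    *head, last = values
    remaining = 4 - len(head)  # bytes covered by the last component
    if any(v > 255 for v in head) or last >= 256 ** remaining:
        return None
    number = 0
    for v in head:
        number = number * 256 + v
    number = number * 256 ** remaining + last
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def canonicalize_host(host: str) -> str:
    host = host.strip(".")               # leading and trailing dots
    host = re.sub(r"\.{2,}", ".", host)  # consecutive dots
    ip = _as_ipv4(host)
    return ip if ip is not None else host.lower()
```

For example, canonicalize_host("..WWW.Example..COM.") gives "www.example.com", and canonicalize_host("3279880203") gives "195.127.0.11".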
Path Canonicalization
Resolve the sequences /../ and /./ in the path by replacing /./ with / and removing /../ along with the preceding path component.
Replace runs of consecutive slashes with a single slash character.
These path canonicalizations should not be applied to query parameters.
Finally, percent-escape all characters in the URL that are <= ASCII 32, >= 127, #, or %. The escapes should use uppercase hex characters.
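The path rules and the final escaping step can be sketched as two small helpers. This sketch assumes the escaping rule covers byte values of 32 or below, 127 or above, and the # and % characters. canonicalize_path expects only the path part of the URL, so query parameters are never touched. Both function names are hypothetical.

```python
def canonicalize_path(path: str) -> str:
    """Resolve /./ and /../ and collapse repeated slashes in a URL path."""
    parts = path.split("/")
    kept = []
    for part in parts:
        if part in ("", "."):
            continue  # empty parts come from "//"; "." is a no-op
        if part == "..":
            if kept:
                kept.pop()  # drop the preceding path component
        else:
            kept.append(part)
    result = "/" + "/".join(kept)
    # Keep a trailing slash when the input pointed at a directory.
    if kept and parts[-1] in ("", ".", ".."):
        result += "/"
    return result


def percent_escape(data: bytes) -> str:
    """Escape bytes <= 32, >= 127, '#', and '%' with uppercase hex."""
    out = []
    for byte in data:
        if byte <= 32 or byte >= 127 or byte in (0x23, 0x25):
            out.append("%{:02X}".format(byte))
        else:
            out.append(chr(byte))
    return "".join(out)


if __name__ == "__main__":
    print(canonicalize_path("/a/./b//c/../d"))  # /a/b/d
    print(percent_escape(b"/a b#%"))            # /a%20b%23%25
```

Escaping works on bytes rather than text, so non-ASCII characters are escaped byte by byte.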
Submission API
This section discusses how to submit URLs that you suspect are unsafe to Safe Browsing for analysis, and then asynchronously check the results. Any URLs confirmed to violate the Safe Browsing Policies are added to the Safe Browsing service.
To submit a URL, send an HTTP POST request to the projects.uris.submit method.
The Submission API accepts only one URL per request. To submit several URLs, send a separate request for each one.
The URL must be valid, but it does not have to be canonicalized.
The HTTP POST response returns a long-running operation.
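As a rough illustration, a submission request could be prepared as shown below. The endpoint path and the body shape (a submission object with a uri field) are our assumptions about the v1 REST interface and should be checked against the current API reference. The project ID and access token are placeholders, and the sketch only builds the request without sending it.

```python
import json
import urllib.request


def build_submit_request(project_id, uri, access_token):
    """Prepare (but do not send) a POST to projects.uris.submit.

    project_id and access_token are placeholders for your own
    Google Cloud project and OAuth 2.0 token.
    """
    # Assumed REST path for projects.uris.submit.
    endpoint = (
        "https://webrisk.googleapis.com/v1/projects/"
        + project_id
        + "/uris:submit"
    )
    # One URL per request, as described above.
    body = json.dumps({"submission": {"uri": uri}}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_submit_request(
        "my-project", "http://suspicious.example/", "TOKEN"
    )
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. The JSON response is then expected to describe the long-running operation, which you check later for the result.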
Web Risk Advisory
Google compiles lists of potentially unsafe websites using automated algorithms and user feedback. Social engineering, malware, and unwanted software pages are the three most common types of harmful pages on these lists. Web Risk makes these lists available to developers.
These lists cannot protect users against every harmful site on the internet, and there is always the possibility that a safe site will be misclassified as risky. However, Google Cloud updates the lists regularly to keep them as up to date as possible.
Phishing and Deceptive sites
A social engineering attack occurs when a web user is tricked into doing something risky online. Social engineering content can be served by the website itself or through embedded resources such as images, ads, or other third-party components.
Phishing is a type of social engineering attack in which a website requests personal or financial information from you under false pretenses.
Malware
Malware is software that tries to steal your personal information or use your computer in ways you did not intend. Malware pages are web pages that contain malicious code that can be downloaded and installed on your computer without your permission.
Unwanted Software
Unwanted software is software that violates Google's Unwanted Software Policy, which outlines a set of fundamental criteria that software should meet. Software that breaches these principles may be detrimental to the user experience, and Google takes steps to protect users from it.
Frequently Asked Questions
What are the services and applications that Web Risk integrates with?
Web Risk provided by Google Cloud integrates with Google Cloud Platform (GCP) and Pesofts.
What is the pricing of Google Cloud Web Risk?
It starts at $50 per 1,000 calls per month. This figure is subject to change, so confirm it against Google Cloud's current pricing page.
Does Web Risk provide any support to its users?
Yes, Web Risk provides online support to its users.
What languages does Web Risk support?
Google Cloud’s Web Risk only supports English.
Does Web Risk provide any training?
It provides training only in the form of documentation as of now.
Conclusion
In today’s world, where web attacks and cyber crimes are on the rise, we saw how Google Cloud’s Web Risk helps provide safe and protected internet surfing. We looked at what the service is, its features, the Lookup API, the Update API, how to detect malicious links and sites, access control, hashing, canonicalization, the Submission API, and the Web Risk Advisory. We hope you learned something new today.