Monthly Archives

February 2018

Entity Extraction On a Website | AWS Comprehend

Categories: AWS, Comprehend, Development, Python

Use Case

You want to understand which entities appear on a company’s website so you can see what that company is focused on. This is useful when you are prospecting, considering a partnership, and so on. How do you do this efficiently? A few tools make it much easier.

1. Select Your Target

Here are the steps that we used for http://www.magicinc.org, a simple Squarespace site. You can confirm this at https://builtwith.com/magicinc.org

2. Get the data

For entity extraction, raw text is the goal. You want as much as you can get without duplicates. The following commands, run from a Mac terminal, pull everything you need.

  1. Create a clean directory named YYYYMMDD_the_domain for the domain you want to search, and change into it.
  2. Run this command: wget -p -k --recursive http://www.magicinc.org
  3. cd into the blog directory inside the downloaded site folder (wget names that folder after the host, here www.magicinc.org).
  4. Concatenate all of the blog articles with this recursive command: find . -type f -exec cat {} ";" >> ../catted_file
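
The mirrored pages are HTML, so concatenating them as-is mixes markup into the text. Below is a minimal Python sketch of step 4 that also strips tags, assuming the standard library's html.parser is good enough for your pages. The names TextExtractor, html_to_text and concat_site_text are hypothetical helpers, not part of the original workflow.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def concat_site_text(root_dir, out_path):
    """Write the text of every file under root_dir to out_path (hypothetical helper)."""
    texts = []
    for path in sorted(Path(root_dir).rglob("*")):
        if path.is_file():
            raw = path.read_text(encoding="utf-8", errors="ignore")
            texts.append(html_to_text(raw))
    Path(out_path).write_text("\n".join(texts), encoding="utf-8")
    return len(texts)
```

Calling concat_site_text("blog", "catted_file") from the downloaded site folder would produce a file comparable to the one from the find command, minus the markup. Keep the output file outside the directory being walked so a second run does not read its own output.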

3. Prep Query to an Entity Extraction Engine | Comprehend

In this simple case, we are going to query AWS’s Comprehend service. We will need to write some simple Python 3 code.

Since we can’t submit more than 5000 bytes per document, we need to submit a batched job that breaks our raw text into smaller pieces. To do that, I wrote some very simple code:


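# Read the concatenated site text produced in step 2.
with open('./body_output/catted_file', 'r', encoding='utf-8') as f:
    temp = f.read()
strings = temp.split(" ")
aws_submission = ""
submission_counter = 0
aws_queued_objects = []
for word in strings:
    pre_add_submission = aws_submission
    aws_submission = aws_submission + " " + word
    # Close the current batch once adding this word would exceed 5000 bytes.
    if len(aws_submission.encode('utf-8')) > 5000:
        submission_counter = submission_counter + 1
        print("Number = " + str(submission_counter) + " with a byte size of " +
              str(len(pre_add_submission.encode('utf-8'))))
        aws_queued_objects.append(pre_add_submission)
        # Start the next batch with the word that did not fit.
        aws_submission = word
# Queue whatever text is left after the last full batch.
if aws_submission.strip():
    aws_queued_objects.append(aws_submission)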

Now we have to submit the batched job. This is very simple, assuming that you have the boto3 library installed and your AWS credentials configured correctly.

import boto3
client = boto3.client('comprehend')
response = client.batch_detect_entities(
    TextList=aws_queued_objects, LanguageCode='en')
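
One thing to watch: batch_detect_entities accepts at most 25 documents per call, so a larger site needs more than one request. Here is a minimal sketch of how that could look, assuming client is a boto3 Comprehend client created with boto3.client("comprehend"). The function name detect_entities_in_batches and its parameters are hypothetical.

```python
def detect_entities_in_batches(client, documents, language_code="en", batch_size=25):
    """Submit documents in groups and return (entities_per_document, errors).

    `client` is assumed to be a boto3 Comprehend client.
    Documents that fail keep an empty entity list and are reported in `errors`.
    """
    entities_per_document = [[] for _ in documents]
    errors = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        response = client.batch_detect_entities(
            TextList=batch, LanguageCode=language_code)
        for result in response.get("ResultList", []):
            # "Index" is the position of the document within this batch.
            entities_per_document[start + result["Index"]] = result["Entities"]
        for error in response.get("ErrorList", []):
            errors.append((start + error["Index"], error.get("ErrorMessage", "")))
    return entities_per_document, errors
```

With the list built earlier, the call would be detect_entities_in_batches(client, aws_queued_objects). Passing the client in as a parameter also makes the function easy to exercise with a stub before spending money on real requests.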

4. Analyze

Now all you have to do is visualize the results. Note that you need to visualize them outside of the Comprehend tool, because there is no way to import data into that viewer. This snapshot is what it looks like.

More importantly, the key work is the analysis. We will leave that up to you!
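
As one possible starting point, here is a small sketch that tallies the most frequent entities in a Comprehend response. It assumes the input is the "ResultList" from batch_detect_entities, where each entity carries "Text", "Type" and "Score". The function name top_entities and the 0.8 score cutoff are arbitrary choices for illustration.

```python
from collections import Counter


def top_entities(result_list, min_score=0.8, limit=10):
    """Count (type, text) pairs across all documents, most frequent first."""
    counts = Counter()
    for result in result_list:
        for entity in result.get("Entities", []):
            # Skip low-confidence detections; the cutoff is an assumption.
            if entity["Score"] >= min_score:
                key = (entity["Type"], entity["Text"].strip().lower())
                counts[key] += 1
    return counts.most_common(limit)
```

Calling top_entities(response["ResultList"]) returns a list of ((type, text), count) pairs that you can print or feed into any charting tool.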

 

Source Code

The code was kept as simple as possible, without overcomplicating things.

Github: https://github.com/Bytelion/aws_comprehend_batched_job

 

Secure DIL Environment Login with Xamarin.Auth SDK

Categories: Development, Mobile, Xamarin

Previously in this blog series, I defined what a DIL environment is and described some of the key technical problems a DIL environment imposes on a mobile application that relies on web services for data and other functionality. Now it is time to begin looking at specific solutions to some of these problems. In this article, I will focus on how to implement a secure login for an application while in a DIL environment.

DIL User Login Authentication Sequence

In a normal connected environment, a user (User A) enters their name and password into the mobile application. The mobile app then sends those credentials to the backend web service for verification. If they are valid, the user is logged in to the application and gains access to its resources. But what happens when the device is disconnected from the network? How can User A’s credentials be validated? The mobile application must be designed to support DIL login. This can be accomplished by securely storing user credentials locally whenever a new user on a device successfully logs in to the application while the device and application are connected to the web service. Once User A has successfully logged in to the application on a specific device in a connected environment, User A can log in to the application on that same device when it enters a DIL environment.

What if another user (User B) also wants to log in to the application on the same device in a DIL environment, but has not previously logged in on that device while it was connected? Unfortunately, User B will be unable to log in, even with valid credentials. It is impractical to store all valid user credentials for the application locally on a mobile device. A user can log in to the application on a given device in a DIL environment only if they have previously logged in on that device while it was connected. This sequence is illustrated in the chart below:
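
The same sequence can also be sketched in code. The following is a language-neutral illustration written in Python, not Xamarin.Auth code. LocalCredentialStore is a hypothetical in-memory stand-in for a secure on-device store, verify_with_service is a hypothetical callback for the backend check, and storing a salted PBKDF2 hash rather than the password itself is a design choice of this sketch, not something the approach above prescribes.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 100_000  # illustrative value, tune for your devices


class LocalCredentialStore:
    """In-memory stand-in for a secure on-device credential store (hypothetical)."""

    def __init__(self):
        self._records = {}

    def save(self, username, password):
        # Keep a salted hash so the raw password is never stored.
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)
        self._records[username] = (salt, digest)

    def verify(self, username, password):
        record = self._records.get(username)
        if record is None:
            return False  # user never logged in on this device while connected
        salt, expected = record
        actual = hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)
        return hmac.compare_digest(actual, expected)


def login(username, password, connected, verify_with_service, store):
    """Return True if the user may enter the application."""
    if connected:
        if verify_with_service(username, password):
            store.save(username, password)  # remember for later DIL logins
            return True
        return False
    # DIL case: only previously verified users can be checked locally.
    return store.verify(username, password)
```

In this sketch, User A succeeds offline only after one successful connected login on the same device, while User B, who never logged in while connected, is denied even with valid credentials.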

Xamarin.Auth SDK

What do we need in order to implement a DIL login? The first step is to find a way to securely store verified user credentials. The Xamarin stack includes an SDK, Xamarin.Auth, that provides a simple and secure cross-platform solution for local user credential storage and user authentication. Xamarin.Auth also includes OAuth authenticators with built-in support for identity providers including Google, Microsoft, Facebook, and Twitter. Additionally, Xamarin.Auth provides support for presenting the sign-in user interface. For more information on these features, check out the official Xamarin developer documentation. The aspect of Xamarin.Auth that we are going to focus on here is the secure local storage of user credentials.

Securely Store User Credentials

To make a DIL login possible, a user must first have a successful login on the device while the device is connected. After the credentials provided by the user have been authenticated by the web service, the verified credential data can be passed to an Account object from the Xamarin.Auth SDK. The Account object can then be saved securely using the Xamarin.Auth AccountStore class. Below is an example of how this can be implemented.

The AccountStore class maps to Keychain services in iOS and KeyStore in Android. This makes it an excellent cross-platform solution for secure storage for verified user credentials that can be used to authenticate user logins when the device enters a DIL environment. The verified credentials stored locally through the AccountStore class can be retrieved and used to verify a DIL login as shown in a simple example below:

Once the user’s credentials are verified against previously authenticated credentials, the user can be allowed access to the application’s functionality and data. If the credentials cannot be verified against the locally stored credentials the user should be denied access.

Conclusion

In a DIL environment, secure login is an issue that needs to be addressed. When developing applications using Xamarin, the Xamarin.Auth SDK offers an effective, efficient, and secure way to store verified user credentials across mobile platforms. That locally stored credential data can then be used to authenticate users who have previously logged in on a specific device when that device is offline. This gives those users the ability to log in and access application features at any time, regardless of network status.