What is InfoFrames?

A fast, powerful, scalable, and programmable library for storage and big data analysis

InfoFrames is a library that provides C++ and Python APIs for storing and processing Big Data. Data is stored as compressed, effective summaries that support advanced AI/ML algorithms. InfoFrames builds on Apache Parquet data storage, extending it with advanced data summaries and custom compression algorithms.
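To make the idea of summary-based access concrete, here is a minimal sketch in plain Python using pyarrow, not the InfoFrames API (which is not shown here). It writes a Parquet file and then answers a question from the per-row-group min/max summaries stored in the file footer, without decompressing the column data; the file name and the data are placeholders. InfoFrames extends this principle with more advanced, ML-oriented summaries.

    import numpy as np
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Store some synthetic sensor readings in Parquet with several row groups,
    # so that each row group carries its own min/max statistics in the footer.
    table = pa.table({
        "sensor_id": np.repeat(np.arange(10), 1_000),
        "value": np.random.default_rng(0).normal(size=10_000),
    })
    pq.write_table(table, "readings.parquet", row_group_size=2_000)

    # Answer "which row groups can contain values above 3.0?" from the footer
    # summaries alone -- the compressed column data itself is never read.
    meta = pq.ParquetFile("readings.parquet").metadata
    for rg in range(meta.num_row_groups):
        stats = meta.row_group(rg).column(1).statistics  # column 1 == "value"
        print(f"row group {rg}: min={stats.min:.3f} max={stats.max:.3f} "
              f"may contain >3.0: {stats.max > 3.0}")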

Highlights:

Fast C++ library
Python bindings
Native tensor data type
Unstructured data storage and summaries
Commercial license
On-demand training
Data processing and storage scalability
Fast time-to-value

What InfoFrames means for your business

More space

InfoFrames tensor summaries keep only the essential information that ML methods require, providing up to 30X storage savings.

More speed

In the tested cases, effective ML training required as little as 3% of the original data, which cut training times by more than 300X while maintaining accuracy.

More data

Saving space and speeding up data processing lets you handle more data while significantly cutting costs.

You can use InfoFrames for:

Large sets of high-frequency sensor data

Log analysis

Storing and analysing large sets of images

Storing big data in efficient, compressed form

Pricing

The best way to run InfoFrames on-premises or in your cloud

Use cases

In the old paradigm, it would be hard to analyse a chosen subset of a very large image/video dataset based on a given feature. To do that, the feature would already need to be included in the metadata, and if it is not, we would need to process all of the data to extract it. This not only requires large processing and storage resources but also forces the teams that use the data to anticipate possible use cases and metadata in advance, restricting their creativity.

With the new paradigm created by InfoFrames, we can quickly perform analytics over objects of interest (e.g. those with the chosen feature) in very large image/video datasets. Moreover, we can easily extend the search to include other attributes/dimensions or use an entirely new context that was not previously considered (meaning “the feature does not exist yet”).

Via effective summaries, we can leverage existing/historical data to the fullest extent in light of new trends, information, and context. This improves analytical and search capabilities beyond what, e.g., search engines offer, since those can help only when we are interested in something already reflected in their indices.

Suppose a new person of interest (NPI) has been identified, such as a new important politician, celebrity, etc. In the old paradigm, how would it be possible to search the existing repository of images and videos to identify historical data containing that NPI? What steps would have to be taken to achieve this?

Clearly, a new index for that NPI would have to be created, which would require:

  1. collecting samples of annotated images of the NPI and training a new computer vision model,
  2. running the new computer vision model over the historical repository (prediction) to obtain the indices,
  3. merging the indexing results, produced outside the analytic database, back into the analytic database.


What if many new people of interest are identified every week? We would have to repeat steps 1 and 2 every week, wasting time and computational resources.

Alternatively, using InfoFrames, the historical repository can be stored in a compressed, preprocessed form, allowing AI/ML training tasks to be done up to 300 times faster.

With InfoFrames, we can:

  1. quickly identify all images/videos in which a given person appeared in the past (see the sketch after this list),
  2. derive the most common behaviours and trends related to that person,
  3. better recommend content to users (for instance, to new users).
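As a sketch of item 1 above, the snippet below uses plain Python with NumPy and scikit-learn rather than the InfoFrames API. It assumes that a compact summary vector (e.g. an embedding) is already stored for every image in the repository, so finding candidate appearances of the NPI means scanning these small summaries against a handful of labelled examples instead of re-running vision models over the raw pixels; all shapes and the distance threshold are hypothetical.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Hypothetical data: one compact summary vector per stored image,
    # plus a few labelled example images of the new person of interest (NPI).
    rng = np.random.default_rng(0)
    historical_summaries = rng.normal(size=(100_000, 128))
    npi_examples = rng.normal(size=(20, 128))

    # Index the small NPI sample and scan the summaries, not the raw images.
    index = NearestNeighbors(n_neighbors=1, metric="cosine").fit(npi_examples)
    distances, _ = index.kneighbors(historical_summaries)

    # Images whose summaries lie close to any NPI example become candidates
    # for review; the 0.25 threshold is arbitrary and illustrative only.
    candidate_ids = np.where(distances[:, 0] < 0.25)[0]
    print(f"{len(candidate_ids)} candidate images selected for review")

Because only the summaries are scanned, the same pass can be repeated for every newly identified person without reprocessing the raw repository.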


Without InfoFrames, such behaviours and trends can only be partially derived from the existing metadata. Relying on metadata alone makes it impossible to use the data source that stores the most crucial information – the images and videos themselves. Moreover, analysing them from an entirely new perspective is hard, as it would require new metadata. Such an approach does not fit the current world, in which trends are constantly changing.

Suppose a new phenomenon has been observed, for example:

  • a new viral trend of building a DIY toy or object that has to be assembled from specific components, or
  • a new trendy shade of colour (e.g., a new “colour of the year” is selected every year).


In the old paradigm, how could an e-commerce company leverage knowledge about this phenomenon to better recommend products to their customers?

In a typical search engine, one cannot simply search for objects with a feature that has not been indexed before. One would first need to create a method (e.g. train a neural network) for that index and then apply it. Creating such a method is an analytical process, possibly requiring many iterations, which is why it is worth performing it on the in-database data, which are smaller and easier to manage.

With InfoFrames, we can quickly identify which products in the existing/historical database adhere to the given trend (for instance, contain elements of the colour or components of interest) and be prepared to recommend such products to the user. Even if no products match a given trend explicitly, we can quickly derive newly defined groups of products with the highest chance of being considered a match.
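As a hedged sketch of how such a trend query could look (plain NumPy, not the InfoFrames API), assume that every product image was summarised at ingestion time by a small hue histogram; the bin layout, the target hue, and the data are all hypothetical.

    import numpy as np

    # Hypothetical summaries: a 36-bin hue histogram per product image,
    # computed once when the image was ingested.
    rng = np.random.default_rng(1)
    n_products, n_bins = 50_000, 36
    hue_histograms = rng.dirichlet(np.ones(n_bins), size=n_products)

    # Suppose the "colour of the year" falls into hue bin 2 (an assumption).
    target_bin = 2

    # Products whose summaries devote a noticeable share of pixels to that hue
    # are trend candidates -- the original images are never reopened.
    share = hue_histograms[:, target_bin]
    trending_products = np.argsort(share)[::-1][:100]
    print("Top candidate products:", trending_products[:10])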

You can imagine a similar use case in one of the subfields of e-commerce, the fashion industry, which is exceptionally vibrant. Assume a new pattern has become popular in clothing, for example a flying elephant, or a new way of putting together wardrobe pieces. How can all products in a company’s database that follow such a trend be identified, so that they can be recommended to customers? The solution, again, lies in InfoFrames.

Analysis of new datasets of images/videos in the context of historical data can provide more information about them. Whenever a new image/video is collected, it can be quickly compared with similar ones which are already stored. This way, the semantic information (metadata) of new datasets is richer than it would have been without the context of historical data.

Such an approach offers completely new possibilities when creating metadata. By using historical information, we can leverage more analogies and connections between newly collected and past data. This allows us to go beyond metadata that was initially defined for the dataset and create new contexts, especially for the behaviours, trends and phenomena that were not noticed before.

Suppose a new product is being offered for online sale. In the new paradigm introduced by InfoFrames, we can compare it to past products using previously indexed metadata (as in the classical case) and compare it further using images/videos. This enriches the product’s semantic information (metadata) and can be used to provide better automatic recommendations for the end user.

Even though we could theoretically compare the products solely based on precomputed metadata, that comparison dimension might not be relevant in the rapidly changing world of e-commerce. One example is the appearance of entirely new categories of products driven by various social media trends. Alternatively, we could also extract the groups/categories of old products that the new product most likely matches.
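The comparison described above can be sketched as follows, again in plain NumPy rather than the InfoFrames API: each past product is assumed to have both a metadata feature vector and a compact image-summary vector, and a new product is matched against both; the vector sizes and the 50/50 weighting are arbitrary illustrations.

    import numpy as np

    # Hypothetical representations of past products: binary metadata features
    # and compact image-summary vectors, both precomputed at ingestion time.
    rng = np.random.default_rng(2)
    meta_past = rng.integers(0, 2, size=(10_000, 40)).astype(float)
    image_past = rng.normal(size=(10_000, 64))

    # The new product, described in the same two ways.
    meta_new = rng.integers(0, 2, size=40).astype(float)
    image_new = rng.normal(size=64)

    def cosine(rows, vec):
        """Cosine similarity of each row against a single vector."""
        return rows @ vec / (np.linalg.norm(rows, axis=1) * np.linalg.norm(vec) + 1e-9)

    # Blend metadata similarity with image-summary similarity.
    score = 0.5 * cosine(meta_past, meta_new) + 0.5 * cosine(image_past, image_new)
    closest = np.argsort(score)[::-1][:10]
    print("Most similar historical products:", closest)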

How can we quickly and accurately assign a new lead or customer to a well-defined segment? For example, we can compare the image/video data that attract particular customers/leads on a web page with other images/videos that existing customers have viewed in the past.

Even if we lack complex information about the new visitors to our website, we can still accurately predict the market segment they belong to based solely on the type of content they interact with.
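A minimal sketch of this idea, using scikit-learn rather than the InfoFrames API: existing customers are assumed to be described by the average summary vector of the content they viewed plus a known segment label, and a new visitor is classified from the summaries of the content they interacted with; all sizes and labels are hypothetical.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # Hypothetical training data: one averaged content-summary vector and a
    # known segment label per existing customer.
    rng = np.random.default_rng(3)
    customer_profiles = rng.normal(size=(5_000, 64))
    customer_segments = rng.integers(0, 4, size=5_000)  # four placeholder segments

    clf = KNeighborsClassifier(n_neighbors=15).fit(customer_profiles, customer_segments)

    # A new visitor: all we know are the summaries of the content they viewed.
    viewed_content = rng.normal(size=(7, 64))
    visitor_profile = viewed_content.mean(axis=0, keepdims=True)
    print("Predicted segment:", clf.predict(visitor_profile)[0])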

In a slightly different scenario, we can use the image/video data to recalculate or redefine the customer segments altogether, if we suspect that the current segments do not sufficiently reflect the structure of the customer base. In that respect, the image/video data may be a source of information that is not represented by the currently stored metadata.

Suspicious events can be defined in two ways:

  1. events which follow a certain pattern (defined upfront at the metadata level),
  2. events that differ from historical events (which is usually not describable at the metadata level).


Detection of type 2 events within images and video is important because it can be connected with new threats (often followed by new business opportunities).

Similarly, we can imagine a situation where we want to detect suspicious phenomena in new images/videos that might have occurred in the past, but with insufficient frequency. Because of that low frequency, the corresponding aspects of metadata might not have been specified yet, but we can start measuring them with new data.
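Detecting type 2 events can be sketched as anomaly detection over the stored summaries, here with scikit-learn’s IsolationForest rather than the InfoFrames API; the summary vectors and the injected outliers are synthetic placeholders.

    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Hypothetical summaries of historical images/videos. "Type 2" suspicious
    # events are those whose summaries do not resemble anything seen before,
    # so no predefined metadata pattern is required.
    rng = np.random.default_rng(4)
    historical = rng.normal(size=(20_000, 32))
    new_batch = np.vstack([rng.normal(size=(500, 32)),
                           rng.normal(loc=6.0, size=(5, 32))])  # 5 injected outliers

    detector = IsolationForest(random_state=0).fit(historical)
    flags = detector.predict(new_batch)  # -1 marks anomalous summaries
    print("Suspicious items in the new batch:", np.where(flags == -1)[0])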

Every product on a production line is photographed for quality assurance. Suppose a picture of a newly produced product has been taken and the product exhibits a new type of defect. Without costly manual inspection, we can quickly analyse the pictures in the context of past claims to improve predictions of the probability of part failure. The same applies to the operation of production lines and their video recordings.

Do you have any questions? Please write to us
