Tuesday, June 27, 2006

A long time ago, when I discussed using the Bayesian approach to catch spam with Kaspersky's staff, they said that it was inappropriate, and that the "one good thing to use" was external spam databases.
But in the latest version of Kaspersky Internet Security, they announced that they use a new spam-catching technology based on the Bayesian approach ;-)

Monday, June 26, 2006

web filtering with ICAP

Some time ago I evaluated the possibility of implementing an ICAP server in our web filtering product. Splitting the product into communication and filtering parts is a good idea, but some things prevent ICAP-based solutions from being implemented effectively:
  • The ICAP client must be smart enough to perform some content filtering tasks, such as file type detection, before forwarding data to the ICAP server
  • Sometimes, during filtering, the filtering engine needs more information than the ICAP client provides
  • For a URL-filtering service, the connection to the ICAP server adds overhead
Thus, ICAP is usable only in environments where content filtering is not the primary task and where there is a good infrastructure of ICAP-enabled clients - Cisco Cache Engine, for example.
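
To give an idea of what a URL check over ICAP involves, here is a rough sketch of the REQMOD message a client would have to send for each request, following the message layout in RFC 3507. The function name, host name and service name are made up for illustration, and a real client would also have to open the connection and parse the server's reply.

```python
def build_reqmod(icap_host, service, http_request_headers):
    """Build an ICAP REQMOD message that wraps HTTP request headers only.

    icap_host and service are hypothetical values chosen by the caller;
    http_request_headers must be the full HTTP header block, ending with
    an empty line (CRLF CRLF).
    """
    http_block = http_request_headers.encode("ascii")
    if not http_block.endswith(b"\r\n\r\n"):
        raise ValueError("HTTP header block must end with an empty line")

    # The Encapsulated header gives byte offsets of the encapsulated parts.
    # With no HTTP body, "null-body" points just past the HTTP headers.
    icap_head = (
        f"REQMOD icap://{icap_host}/{service} ICAP/1.0\r\n"
        f"Host: {icap_host}\r\n"
        f"Encapsulated: req-hdr=0, null-body={len(http_block)}\r\n"
        "\r\n"
    ).encode("ascii")
    return icap_head + http_block


if __name__ == "__main__":
    message = build_reqmod(
        "icap.example.net",   # hypothetical ICAP server
        "urlfilter",          # hypothetical service name
        "GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n",
    )
    print(message.decode("ascii"))
```

All of this wrapping, plus a network round trip, is needed just to ask about one URL, which is the overhead mentioned in the list above.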

Friday, June 23, 2006

Content security for Kids in Australia

The government of Australia wants to provide free content-filtering products for home computers, to filter content for kids (pornography blocking). This news comes from News.com.
Some ISPs can provide such content filtering as a service.
We are also currently negotiating with one of the biggest Russian telecommunications providers to offer content filtering as a service for schools, companies and government organisations.

Monday, June 19, 2006

forgot to introduce myself

I forgot to introduce myself after creating this blog.
I have been working in the field of content filtering and computer security for 5 years. Currently I hold the position of Head of the software development department, which develops and supports 2 content filtering products - MailBoss (mail filtering) and WebBoss (web protocol filtering).
We do research in areas tightly coupled with content filtering, such as data format detection, language and encoding detection for text fragments, and information categorisation.
So, I hope that my blog will be interesting for you! Until the next post ;-)

Friday, June 16, 2006

voip filtering

Currently we are looking for solutions for VoIP detection, capture and analysis. We have found the open source Oreka, which can detect and capture VoIP streams. But the main task remains - how can we analyse voice, extract phrases from it, and get normal text to analyse?
Voice recognition (like image recognition) is an interesting topic, but modelling and implementing this task would require a lot of resources.
Does anybody know of solutions for voice recognition?

about text filtering

There are several approaches to text filtering:
- searching for specific words in text files - it is very simple to implement and cheap for the processor;
- searching with regexps - also simple when using external libraries, but slower than simple text search;
- another approach to determining a text's category is to use a combination of words with weights. Using positive and negative weight values, we can select data that matches a given category. This method is harder to implement, but it is still simple. There is one problem, though - how to calculate the weights for the given words? Manually updating weights and words is hard, so this method is not good for practical use;
- yet another approach to determining a text's category is something like Bayesian statistical text classification - we can use automatic text analysis tools that compare texts in several categories and extract the needed information - words, weights, etc. This method is slower than the others, but it gives good results, together with automatic information extraction.
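
The weighted-words approach from the list above can be sketched in a few lines. This is only an illustration, not the code from our products: the category, the words, the weights and the threshold are all invented, and picking them by hand is exactly the hard part.

```python
import re

# Hypothetical, hand-picked weights for an illustrative "gambling" category.
# Positive weights push a text towards the category, negative ones away.
WEIGHTS = {
    "casino": 3.0,
    "jackpot": 2.5,
    "poker": 2.0,
    "tutorial": -1.0,
    "charity": -2.0,
}
THRESHOLD = 4.0  # assumed cut-off; it would need tuning on real data


def category_score(text, weights=WEIGHTS):
    """Sum the weights of all known words that occur in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(weights.get(word, 0.0) for word in words)


def matches_category(text, weights=WEIGHTS, threshold=THRESHOLD):
    """A text matches when its total score reaches the threshold."""
    return category_score(text, weights) >= threshold


if __name__ == "__main__":
    for sample in ("Win the jackpot at our casino",
                   "Poker tutorial for a charity event"):
        print(sample, "->", category_score(sample), matches_category(sample))
```

The negative weights keep texts that merely mention a category word in another context from being matched.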

In our products we use some of these methods to classify texts - users can select which method to use, depending on the task they want to accomplish.
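
For the Bayesian approach, a minimal sketch could look like the code below. It is a plain multinomial naive Bayes classifier with add-one smoothing, which I assume here as a typical example of the technique; it is not the implementation from our products, and the function names and training texts are invented. The point is that the per-word statistics are extracted automatically from sample texts instead of being set by hand.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(samples):
    """samples: iterable of (category, text) pairs.

    Returns per-category word counts, per-category document counts
    and the overall vocabulary.
    """
    word_counts = {}
    doc_counts = Counter()
    for category, text in samples:
        doc_counts[category] += 1
        word_counts.setdefault(category, Counter()).update(tokenize(text))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return word_counts, doc_counts, vocabulary


def classify(model, text):
    """Return the category with the highest log-probability for the text."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    # Words never seen in training carry no information, so skip them.
    words = [w for w in tokenize(text) if w in vocabulary]
    best_category, best_score = None, -math.inf
    for category, counts in word_counts.items():
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(counts.values())
        for word in words:
            # Add-one (Laplace) smoothing avoids log(0) for words
            # that were seen only in other categories.
            score += math.log((counts[word] + 1) /
                              (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


if __name__ == "__main__":
    model = train([
        ("spam", "cheap pills buy now"),
        ("spam", "buy cheap watches now"),
        ("ham", "meeting agenda for monday"),
        ("ham", "project status and meeting notes"),
    ])
    print(classify(model, "buy cheap pills"))
    print(classify(model, "meeting notes monday"))
```

Training is the slow part; the extracted word counts play the role of the weights from the previous method.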

Tuesday, June 13, 2006

about this blog

I had posted various things about content filtering in my personal blog, but now I have decided to move everything related to content filtering to a dedicated blog.
So, I am pleased to present to you a new blog about content-filtering techniques. I'll write about content filtering of e-mail, web, etc., and about the technical and moral aspects of it. I'll try to post frequently.
You're welcome!