Impact Learning Implementation
Impact Learning is a supervised machine learning algorithm that learns classification and linear or non-linear regression from examples, and it has been applied to a range of machine learning problems. A Python module built on tensors is available for using Impact Learning in regression or classification problems. The module depends on TensorFlow 2.0, scikit-learn, SciPy, and NumPy. To install it in your machine learning environment, run pip install impact_learning.
SeqVec: Sequence Vectorizer (Word Embedding)
Sequence Vectorizer is a word embedding method, with an accompanying equation, that preserves the sequential impact of every word in a sentence. It works in two modes: one-directional sequences and bidirectional sequences. Any type of sequence extraction is possible with Sequence Vectorizer. To install it, run pip install SeqVectorizer.
Direction Class Implementation
Direction Class is a hybrid supervised machine learning algorithm for solving classification problems from examples. It establishes a relationship based on cross-entropy, using a hybrid approach to finding direction. A Python module built on tensors is available for applying it to classification problems. The module depends on TensorFlow 2.0, scikit-learn, SciPy, and NumPy.
Doctor Apaa: An AI Based Bangla Virtual Chatting System
The coronavirus emerged only in December 2019, yet the world is already dealing with a pandemic of the virus and the disease it causes, Covid-19. With Corona taking a severe toll and lockdowns in place almost everywhere, people need something that works for everyone and can act as a friendly doctor during this time. In many places there is no doctor, and where there is one, they may avoid contact with patients for fear of Corona. Our program Dr. Apaa addresses this problem. It is built with artificial intelligence, deep learning, natural language processing, Bangla speech recognition, and a Bangla chatbot. It analyzes your text, converts it into its own internal representation, predicts your risk of Coronavirus infection, and can suggest a healthy daily routine. Users can share their feelings at any time, and it tries to cheer them up by telling jokes and singing songs. It can also provide police helpline numbers and information about the nearest hospital, and report the latest Corona situation. In general, this AI program can be a good friend and a source of support for a person in home quarantine. Just as a doctor talks with a patient about their illness before reaching a decision, the program works the same way to determine Corona risk. It also handles sentiment well and understands the user's emotions. A web interface is available at https://covid.moonpi.tech/
BnSentiment: Bangla Sentiment Analysis
Sentiment Analysis (SA) is the process of extracting the sentiment of someone’s observation, evaluation, or opinion on subjects such as products, services, or individuals. It has become a popular research area in Natural Language Processing (NLP) because it enables a machine to recognize the sentiment of texts. There has been much recent work on SA for English and other languages, but comparatively little for Bangla. In this work, we present deep learning and machine learning approaches to binary sentiment analysis, then compare the algorithms using statistical procedures to find the best-performing techniques for binary sentiment classification of Bangla text. This work also provides a pre-trained, fine-tuned model named BnSentiment for researchers and for individuals who want to automate the detection of sentiment polarity in their user review systems. The module is intended to strengthen the Bangla NLP community and support further research. The model reaches an accuracy of 84.5% and was trained on a sufficiently large dataset. To install it, run pip install BnSentiment.
VecLemma: A Hybrid Lemma Approach
VecLemma is a hybrid technique for converting each word of a text to its lemma. The algorithm is based on vector relations and is a form of non-parametric statistical learning. It first converts every word into a vector using computational linguistic methods, then evaluates the relations between the vectors to find the best lemma. The algorithm was developed on a set of 42k English lemmas, and a pre-trained module is available. To install it, run pip install veclemma. (In progress)
Ekushe: The Bangla Language Toolkit
The Bangla Language Toolkit, or more commonly Ekushe, is a suite of Artificial Intelligence libraries and programs for symbolic analysis of the Bangla language, written in the Python programming language. We have used machine learning, deep learning, mathematics, and statistical techniques to develop Ekushe. The main purpose of this tool is to support computational research on the Bangla language so that Bangla can keep its value in the world of AI. This tool can be significant for business, economics, e-commerce, offices, banks, and other sectors. So far we have developed various computational linguistics techniques for the Bangla language, such as a chatbot, lemmatization, a spelling checker, semantic analysis, and topic modeling (under development). To install it in your machine learning environment, run pip install ekushe.
The names of people have great significance in various types of computing applications. In general, people’s names carry a potential distinction between genders. Detecting gender from names with high accuracy can be very challenging for names written in Bangla script. A pre-trained Python module has been developed using machine learning and deep learning approaches on a dataset of 22k real Bangla names. To install this module in your AI environment, run pip install bangla-linga.
BnLemma: Bangla Lemmatization Framework
Natural language processing (NLP) has enormous applications in automation, and lemmatization is an essential NLP preprocessing technique that reduces a word to its root word. However, there is a scarcity of effective algorithms in Bangla NLP, which led us to develop a Bangla lemmatization tool. Because no lemmatization tool has been available, rule-based stemming processes usually play the role of lemmatization in Bangla language processing. In this paper, we propose a Bangla lemmatization framework together with three lemmatization techniques, based on data structures and algorithms, that are used to build the framework. We use a Trie based on prefixes in the Bangla language and a mapping algorithm we developed named “Dictionary Based Search by Removing Affix (DBSRA)”, while Levenshtein distance is given priority when lemmatizing unknown words rather than known words. Finally, we ran experiments on Bangla lemmatization, and our DBSRA performs better than the other algorithms. To install it in your machine learning environment, run pip install BnLemma.
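The two general building blocks named above, a prefix Trie and Levenshtein distance, can be sketched as follows. This is a minimal illustration and not the BnLemma implementation or the DBSRA algorithm: the lemmatize helper, its fallback rule, and the English toy lemma list are all assumptions made for the example.

```python
from typing import Dict, Iterable, Optional


class Trie:
    """Minimal character-level prefix tree."""

    def __init__(self) -> None:
        self.children: Dict[str, "Trie"] = {}
        self.is_word = False

    def insert(self, word: str) -> None:
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def longest_prefix(self, word: str) -> Optional[str]:
        """Return the longest stored word that is a prefix of `word`, or None."""
        node, best = self, None
        for i, ch in enumerate(word):
            node = node.children.get(ch)
            if node is None:
                break
            if node.is_word:
                best = word[: i + 1]
        return best


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def lemmatize(word: str, trie: Trie, lemmas: Iterable[str]) -> str:
    """Hypothetical rule: use a Trie prefix hit, else the closest lemma by edit distance."""
    hit = trie.longest_prefix(word)
    if hit is not None:
        return hit
    return min(lemmas, key=lambda lemma: levenshtein(word, lemma))


# Toy English lemma list, used only to keep the example easy to follow.
LEMMAS = ["walk", "talk", "play"]
TRIE = Trie()
for lemma in LEMMAS:
    TRIE.insert(lemma)

print(lemmatize("walking", TRIE, LEMMAS))  # resolved by the Trie prefix lookup
print(lemmatize("wolked", TRIE, LEMMAS))   # resolved by the edit-distance fallback
```

The prefix lookup covers words that begin with a known lemma, and the edit-distance fallback covers misspelled or unseen forms, which mirrors the idea of preferring Levenshtein distance for unknown words.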
BnFeatureExtraction: Bangla Feature Extraction Framework & Python Pre-trained Module
A word embedding is a learned representation for text in which words with similar meanings have similar representations. This approach to representing words and documents may be considered one of the key breakthroughs of deep learning on challenging natural language processing problems. Feature extraction tools for Bangla words are lacking. For this purpose, we developed a Python module that supports both training and pre-trained word embeddings. The framework uses six high-performance techniques: count vectorizer, hash vectorizer, TF-IDF vectorizer, word2vec, fastText, and GloVe. The pre-trained embeddings are built from all news published by the newspaper “The Daily Prothom Alo” over 9 years (2010-2019). To install this module in your computational linguistics environment, run pip install BnFeatureExtraction.
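To make two of the listed techniques concrete, here is a small standard-library sketch of count vectors and TF-IDF weights. It does not use the BnFeatureExtraction API; the function names, the whitespace tokenizer, and the unsmoothed idf = log(N / df) formula are assumptions chosen for clarity, and libraries often use smoothed variants of this formula.

```python
import math
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    # Whitespace tokenization is an assumption; real Bangla text needs more care.
    return text.split()


def count_vectors(docs: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted vocabulary and one term-count row per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_vectors(docs: List[str]) -> Tuple[List[str], List[List[float]]]:
    """Weight each count by idf = log(N / df), where df is the document frequency."""
    vocab, rows = count_vectors(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    return vocab, [[count * weight for count, weight in zip(row, idf)] for row in rows]


# Two toy sentences that share their first word.
DOCS = ["আমি ভাত খাই", "আমি বই পড়ি"]
VOCAB, COUNTS = count_vectors(DOCS)
_, WEIGHTS = tfidf_vectors(DOCS)
for term, weight in zip(VOCAB, WEIGHTS[0]):
    print(term, round(weight, 3))
```

A word that occurs in every document gets weight zero under this formula, which is why TF-IDF highlights the terms that distinguish one document from another.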
Bangla POS (Part-of-Speech) Tagger
In corpus linguistics, part-of-speech tagging, also called grammatical tagging, is the process of marking up a word in a text as corresponding to a particular part of speech, based on both its definition and its context. Bangla POS Tagger is a tool for annotating the parts of speech of any Bangla sentence. The package includes a pretrained model, trained with the Viterbi algorithm. The source is available at https://github.com/Kowsher/Bangla-NLP/tree/master/Bangla%20POS-Tagger
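The Viterbi algorithm is commonly used with a hidden Markov model to find the most probable tag sequence for a sentence. The sketch below assumes such a model; the tag set, all probabilities, and the English toy words are hypothetical and are not taken from the Bangla POS Tagger package.

```python
from typing import Dict, List


def viterbi(
    words: List[str],
    tags: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
    unknown_p: float = 1e-6,
) -> List[str]:
    """Return the most probable tag sequence under a first-order HMM."""
    if not words:
        return []
    # Each column maps tag -> (best probability so far, previous tag).
    columns = [
        {t: (start_p[t] * emit_p[t].get(words[0], unknown_p), None) for t in tags}
    ]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (columns[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, unknown_p), p)
                for p in tags
            )
        columns.append(column)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: columns[-1][t][0])]
    for column in reversed(columns[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Hypothetical toy model with two tags.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.8, "VERB": 0.2}
TRANS = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
EMIT = {
    "NOUN": {"dogs": 0.5, "bark": 0.1},
    "VERB": {"dogs": 0.01, "bark": 0.6},
}

print(viterbi(["dogs", "bark"], TAGS, START, TRANS, EMIT))
```

For long sentences the products of probabilities become very small, so practical implementations usually sum log-probabilities instead.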
Bangla Regular Expression
A regular expression is a sequence of characters that defines a search pattern. Such patterns are usually used by string searching algorithms for “find” or “find and replace” operations on strings, or for input validation. The technique was developed in theoretical computer science and formal language theory. However, developing a Bangla linguistics toolkit with regular expressions is difficult. This tool provides a wide range of regular expressions and use cases for the Bangla language. It is developed from scratch in Python. To install it, run pip install bre.
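As a small illustration of regular expressions on Bangla text, the sketch below uses only Python's built-in re module and the Unicode Bengali block (U+0980 to U+09FF). It does not use the bre package, and the pattern and function names are assumptions made for this example.

```python
import re
from typing import List

# Runs of characters from the Unicode Bengali block (letters, signs, digits).
BANGLA_RUN = re.compile(r"[\u0980-\u09FF]+")
# Bengali digits ০ to ৯ occupy U+09E6 to U+09EF.
BANGLA_DIGITS = re.compile(r"[\u09E6-\u09EF]+")


def bangla_tokens(text: str) -> List[str]:
    """Return every maximal run of Bengali-block characters in `text`."""
    return BANGLA_RUN.findall(text)


def is_bangla_number(text: str) -> bool:
    """True if `text` consists only of Bengali digits."""
    return BANGLA_DIGITS.fullmatch(text) is not None


def strip_non_bangla(text: str) -> str:
    """Keep Bengali-block runs only, joined by single spaces."""
    return " ".join(bangla_tokens(text))


SAMPLE = "আমি Python শিখি ১২৩"
print(bangla_tokens(SAMPLE))
print(strip_non_bangla(SAMPLE))
```

Text that relies on zero-width joiners (U+200C, U+200D) falls outside this block, so a production pattern would need to account for those characters as well.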
Bisect Decent is a fast, dynamic algorithm that finds the minimum of the cost function in little time by dividing the loss function into two segments. It requires no learning rate. The algorithm uses the loss value to decide which segment of the loss to keep searching. It is based on computational mathematics and numerical methods. The main drawback of this approach is that it can get stuck in local minima. (In progress)
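One way to picture the idea of halving the search region without a learning rate is classical bisection on the sign of the gradient. The sketch below is an illustration under stated assumptions, not the author's algorithm: it assumes a one-dimensional differentiable loss and a starting interval whose endpoints have gradients of opposite sign, and the function name and parameters are hypothetical.

```python
from typing import Callable


def bisect_minimize(
    grad: Callable[[float], float],
    lo: float,
    hi: float,
    tol: float = 1e-8,
    max_iter: int = 200,
) -> float:
    """Locate a stationary point of a 1-D loss by bisecting on the gradient sign."""
    if grad(lo) > 0 or grad(hi) < 0:
        raise ValueError("need grad(lo) <= 0 <= grad(hi) to bracket a minimum")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        g = grad(mid)
        if g == 0 or hi - lo < tol:
            return mid
        if g > 0:
            hi = mid  # loss is rising at mid, so the minimum lies to the left
        else:
            lo = mid  # loss is falling at mid, so the minimum lies to the right
    return (lo + hi) / 2


def toy_grad(w: float) -> float:
    # Gradient of the toy loss (w - 3)^2.
    return 2 * (w - 3)


print(bisect_minimize(toy_grad, -10.0, 10.0))
```

Each step halves the interval, so no step size has to be tuned. On a non-convex loss the same procedure converges to whichever stationary point the interval happens to bracket, which is consistent with the local-minima drawback noted above.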