Maria Marcano

Saturday, October 5, 2019

Moved to github

I'm moving to github and starting a new coding repo blog here https://mariangemarcano.github.io/. see you there!

Tuesday, March 17, 2015

I have been developing features on top of buildbot on a project that we internally called Katana, which is a custom version based on 0.8.7.p1. Here Na’tosha gives a very good explanation about the project.

Now, I’m going to quickly go over some of the features and provide a more detailed explanation on how to get started running katana.

Overall features

Some of the features and customizations we have developed for katana:

Multiple project support
Build status page filters the information by project / branches
Trigger builds and upload / download artifact dependencies
New UI design
Real-time data
Authentication AD/LDAP, etc

Get the code
git clone https://github.com/Unity-Technologies/buildbot.git src

git checkout katana (this is our stable branch)

Development environment

You can run katana in the same way as you do with buildbot (with a few differences):

Setup python virtual environment
virtualenv --no-site-packages katana-venv/
source katana-venv/bin/activate

Python modules
We are using the following modules

pip install -Iv SQLAlchemy-migrate==0.7.2
pip install -Iv SQLAlchemy==0.7.9
pip install python-ldap
pip install mock
pip install mysql-python

Setup the master and slave

pip install -e master

pip install -e slave

Check the running version

buildbot --version

buildslave --version

Run the tests

trial buildbot.test

trial buildslave.test

Create the buildmaster

buildbot create-master buildmaster

Change configuration file

mv master.cfg.sample master.cfg

Change html folder

Set the public_html folder to the one matching the resources (src)
c['status'].append(html.WebStatus(http_port=8010, authz=authz_cfg,

public_html="../src/www"))

Running the master

You should be able to see katana at http://localhost:8010/ after running:
buildbot start

You should be able to login to katana using pyflakes user and password, the default example configuration should look this:

authz_cfg=authz.Authz(
   # change any of these to True to enable; see the manual for more
   # options
   auth=auth.BasicAuth([("pyflakes","pyflakes")]),
   gracefulShutdown = False,
   forceBuild = 'auth', # use this to test your slave once it is set up
   forceAllBuilds = False,
   pingBuilder = False,
   stopBuild = False,
   stopAllBuilds = False,
   cancelPendingBuild = False,
)

If you sign in using pyflakes usr/password you will see Katana running

That's all, next time will bring more examples of using Katana.

Monday, September 17, 2012

Speech Recognition – Setting up sclite word alignment

Word alignment is used to measure the accuracy of a decoder. Sphinx tutorial references sclite from National Institute of Standards and Technology (NIST). This time I’m going to share some notes on how to run and setup sclite.

Running sclite

Sclite is a tool for scoring and evaluating the output of speech recognition by comparing the hypothesized text (HYP) output by the speech recognizer to the correct, or reference (REF) text. After comparing REF to HYP, (a process called alignment), statistics are gathered during the scoring process and a variety of reports can be produced to summarize the performance of the recognition system.

This is an example output using the files located src\sclite\testdata\tests.hyp and sclite\testdata\tests.ref:

Parameters:

The '-h' option is a required argument which specifies the input hypothesis file.
The '-r' option, a required argument, specifies the input reference file which the hypothesis file(s) are compared to.

sclite -h C:\Project\SpeechRecognition\CMUSphinx\3rdPartyLibs\sctk-2.4.3\src\sclite\testdata\tests.hyp -r C:\Project\SpeechRecognition\CMUSphinx\3rdPartyLibs\sctk-2.4.3\src\sclite\testdata\tests.ref

Setup sctk-2.4.0-20091110-0958.tar.bz2 on Windows 7

I downloaded Speech Recognition Scoring Toolkit (SCTK) which includes the SCLITE, ASCLITE, tranfilt, hubscr, SLATreport and utf_filt scoring tools.

I could compile it with gcc version 3.4.4, found in the following MinGW setup package mingw-get-inst-20101030.exe.

It was also necessary to modify 'src/rfilter1/makefile.in' and change the value of OPTIONS to be blank (as specified in the instructions)

The following compilation error is thrown when compiling using gcc version 4.6.2:

recording.h:122:29: error: 'Filter::Filter' cannot appear in a constant-expression
recording.h:122:36: error: template argument 2 is invalid
recording.h:122:36: error: template argument 4 is invalid
make[3]: *** [main.o] Error 1

Setup sctk-2.4.2-20120810-0938.tar.bz2 on Windows 7

Something similar happened with this version, I could compile it with gcc version 3.4.4.

The following compilation error is thrown when using gcc version 4.6.2:

In file included from asctools.h:23:0,
                 from asctools.cpp:22:
timeval.h:33:8: error: redefinition of 'struct timeval'
c:\mingw\bin\../lib/gcc/mingw32/4.6.2/../../../../include/winsock2.h:109:8: error: previous definition of 'struct timeval'
make[2]: *** [asctools.o] Error 1
make[2]: Leaving directory `/c/Project/SpeechRecognition/CMUSphinx/3rdPartyLibs/
sctk-2.4.2/src/asclite/test'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/c/Project/SpeechRecognition/CMUSphinx/3rdPartyLibs/
sctk-2.4.2/src/asclite'
make: *** [all] Error 2

This is thrown when compiling rfilter1

gcc  -o rfilter1 rfilter1.c
C:\Users\MANGEL~1\AppData\Local\Temp\ccuj1MU6.o:rfilter1.c:(.text+0x760): undefined reference to `strncmpi'
C:\Users\MANGEL~1\AppData\Local\Temp\ccuj1MU6.o:rfilter1.c:(.text+0x7c4): undefined reference to `strncmpi'
C:\Users\MANGEL~1\AppData\Local\Temp\ccuj1MU6.o:rfilter1.c:(.text+0x827): undefined reference to `strncmpi'
C:\Users\MANGEL~1\AppData\Local\Temp\ccuj1MU6.o:rfilter1.c:(.text+0x935): undefined reference to `strncmpi'
collect2: ld returned 1 exit status
make[1]: *** [rfilter1] Error 1
make[1]: Leaving directory `/c/Project/SpeechRecognition/CMUSphinx/3rdPartyLibs/sctk-2.4.2/src/rfilter1'
make: *** [all] Error 2

It is also possible to compile this with gcc version 4.6.2 after removing asclite tests and rfilter1 from make file.

After finishing the setup you are able to run sclite as described initially.

Resources:

http://www.speech.cs.cmu.edu/sphinx/tutorial.html
http://www.itl.nist.gov/iad/mig//tools/
http://www.nist.gov/itl/iad/mig/tools.cfm
sclite documentation ctk-2.4.3/doc/sclite.htm
http://www.voxforge.org/home/forums/message-boards/speech-recognition-engines/can-nists-sclite-be-used-on-windows

Sunday, September 9, 2012

Speech Recognition with PocketSphinx

The following post puts together information from different sources about speech recognition and gives a brief overview of the CMUSphinx project and how to get started with PocketSphinx on Windows.

CMUSphinx project is an open source speech recognition project developed at Carnegie Mellon University, which consists of various tools use to build speech applications:

CMUclmtk — language model tools
Sphinxtrain — acoustic model training tools

The following recognizers (decoders)

Pocketsphinx: Designed to be fast and for real time speed written in C, supports desktop applications and mobile devices. It needs the library Sphinxbase.
Sphinx3: Speed recognizer intended for researchers .
Sphinx4: Speech recognition written in the Java.

Let’s learn how HMM based speech recognition is handled: it functions by first learning the characteristics (or parameters) of a set of sound units, and then using what it has learned about the units to find the most probable sequence of sound units for a given speech signal. The process of learning about the sound units is called training. The process of using the knowledge acquired to deduce the most probable sequence of units in a given signal is called decoding, or simply recognition.

Setup Pocketsphinx on windows

Environment: Windows 7 and Visual Studio 2012, sphinxbase-0.7, pocketsphinx-0.7

Name the folders (sphinxbase / pocketsphinx ), the project pocketsphinx has external dependencies that use the relative paths like the following “..\..\..\sphinxbase\include\sphinxbase\ad.h”.

To test the installation let's run pocketsphinx_continuous.exe, this tool runs speech recognition both continuous listening from microphone and continuous file transcription. To run it requires:

Copy sphinxbase.dll to the build folder, for example C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\bin\Debug.
The parameter –hmm, the directory containing acoustic model files.
The parameter –lm, word trigram language model input file.
The parameter –dict, main pronunciation dictionary (lexicon) input file.

This is running the command with the information contained in the project.

pocketsphinx_continuous.exe -hmm C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\hmm\en_US\hub4wsj_sc_8k 
-dict C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\lm\en_US\cmu07a.dic 
-lm C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\lm\en_US\wsj0vp.5000.DMP

this is the output of me saying “no no no”

I will take a closer look at the project to check more the accuracy of the recognition.

Terminology

language model assigns a probability to a sequence of m words P(w1, .., w1) by means of a probability distribution. Language modeling is used in many natural language processing applications such as speech recognition, machine translation, part-of-speech tagging, parsing and information retrieval. In speech recognition and in data compression, such a model tries to capture the properties of a language, and to predict the next word in a speech sequence.

HMM: (Hidden_Markov_model) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states.

Resources:

Tuesday, August 28, 2012

Buildbot Part IV: Customize the scheduler

My previous post was about customizing buildbot to display more information on the waterfall page. Now let’s customize the scheduler to trigger the knightly build only if the continuous one hasn’t failed.

One way to accomplish this is by adding a new scheduler called DependentNightly, this scheduler receives a new parameter called relatedBuilderNames containing a set of builders that the scheduler should watch. The startbuild method will now check, for every related builder, the last finished build status and start the knightly build if no failures are found. Bellow master.cfg changes.

from buildbot.schedulers.basic import SingleBranchScheduler
from buildbot.schedulers.timed import Nightly
from buildbot.changes import filter
from buildbot.status.builder import SUCCESS, WARNINGS, FAILURE
from twisted.python import log
from twisted.internet import defer

class DependentNightly(Nightly):

    relatedBuilderNames = []
    
    def __init__(self, relatedBuilderNames = [], **kwargs):
        Nightly.__init__(self, **kwargs)
        self.relatedBuilderNames = relatedBuilderNames

    def startBuild(self):
        startBuild = True
        for builderName in  self.relatedBuilderNames:
            builder_status = self.master.status.getBuilder(builderName)
            lastBuild = builder_status.getLastFinishedBuild()
            if lastBuild != None:
                startBuild = startBuild and (lastBuild.getResults() != FAILURE)

        if (startBuild):
            return Nightly.startBuild(self)
        else:
            log.msg(("Nightly Scheduler <%s>: skipping build. " +
                         "- Related builder <%s> failed last build")
                        % (self.name, ", ".join(self.relatedBuilderNames)))
            return defer.succeed(None)   

c['schedulers'] = []
c['schedulers'].append(SingleBranchScheduler(
                            name="Continuous",
                            change_filter=filter.ChangeFilter(branch='master'),
                            treeStableTimer=3*60,
                            builderNames=["Continuous"]))
c['schedulers'].append(DependentNightly(
                            name="Ninghtly",
                            branch='master',
                            hour=13,
                            minute=24,
                            onlyIfChanged=False,
                            builderNames=["Ninghtly"],
                            relatedBuilderNames=["Continuous"] ))

When the scheduler finds out that the build won’t run, a message will be log to twistd.log file as shown bellow.

I hope this series of blog posts were helpful to give you an overall idea about using buildbot for CI.

Thursday, August 23, 2012

Buildbot Part III: Customizations

On my previous post we configured buildbot to run automated build and tests. Now, let’s add customizations, usually this configurations consists of subclasses set up for use in a regular buildbot configuration file (master.cfg).

Display more information in the waterfall

The first customization will be adding detail information in the waterfall page. One way to do this is writing a new custom buildstep. Most shell commands emit messages to stdout/stderr as they run, we will get output information from xunit.console.exe and use the functions provided by buildbot to perform summarization of the step base on the content of the stdio logfile. Bellow changes done to display details about the tests executed in a dll: total, failed, skipped and the time it took to perform the tests.

class xUnitConsole(ShellCommand):
    def __init__(self, **kwargs):
        ShellCommand.__init__(self, **kwargs)

    def describe(self, done=False):
        description = ShellCommand.describe(self,done)
        if done:
            description.append('total: %d ' % self.step_status.getStatistic('total', 0))
            failed = self.step_status.getStatistic('failed', 0)
            if failed > 0:
                description.append('failed: %d' % failed)
            skipped = self.step_status.getStatistic('skipped', 0)
            if skipped > 0:
                description.append('skipped: %d' % skipped)
            description.append('took: %s seconds' % self.step_status.getStatistic('time', 0))                
            
        return description

    def createSummary(self,log):
        _re_result = re.compile(r'(\d*) total, (\d*) failed, (\d*) skipped, took (.*) seconds')
        
        total = 0
        failed = 0
        skipped = 0
        time = 0
        
        lines = self.getLog('stdio').readlines()
        
        for l in lines:
            m = _re_result.search(l)
            if m:                
                total = int(m.group(1))
                failed = int(m.group(2))
                skipped = int(m.group(3))
                time = m.group(4)
              
        self.step_status.setStatistic('total', total)
        self.step_status.setStatistic('failed', failed)
        self.step_status.setStatistic('skipped', skipped)
        self.step_status.setStatistic('time', time)

def RunTests(factory, testproject, configuration, testdirectory):
        factory.addStep(xUnitConsole(command=['src\\xunit.console\\bin\\%s\\xunit.console.exe' % configuration,
                                '%s\\%s' % (testdirectory, testproject), "/html", "%s\\testresults.html" % testdirectory],
                                     workdir='build\\', name="Run %s" % testproject,
                                     description="Running %s" % testproject ,
                                     descriptionDone="Run %s done" % testproject,
                                     flunkOnFailure=False, warnOnFailure=True))

We switched from using ShellCommand to use the new one xUnitConsole, bellow an image with the new information displayed in the waterfall page.

you can follow a similar approach to display more information in compiling buildstep, check the visual studio buildstep in vstudio.py.

xunit.net report

Next customization will be adding a link to the report generated by xunit.console. The ShellCommand has the parameter logfiles that is useful in this case, when a command logs information in a local file rather than emitting everything to stdout/stderr. By setting lazylogfiles=True, logfiles will be tracked lazily, meaning they will only be added when and if something is written to them. Finally, the addHtmllog adds a logfile containing pre-formatted HTML. Bellow the changes done in master.cfg.

class  xUnitConsole(ShellCommand):    

    def __init__(self, xunitreport=None, **kwargs):
        ShellCommand.__init__(self, **kwargs)

    def describe(self, done=False):
        description = ShellCommand.describe(self,done)
        
        if done:
            description.append('total: %d ' % self.step_status.getStatistic('total', 0))
            failed = self.step_status.getStatistic('failed', 0)
            if failed > 0:
                description.append('failed: %d' % failed)
            skipped = self.step_status.getStatistic('skipped', 0)
            if skipped > 0:
                description.append('skipped: %d' % skipped)
            description.append('took: %s seconds' % self.step_status.getStatistic('time', 0))                
            
        return description

    def createSummary(self,log):
        html = self.getLog("xunit-report")
        text = html.getText()
        self.addHTMLLog("xunit-report.html", text)
        
        _re_result = re.compile(r'(\d*) total, (\d*) failed, (\d*) skipped, took (.*) seconds')
        
        total = 0
        failed = 0
        skipped = 0
        time = 0
        
        lines = self.getLog('stdio').readlines()
        
        for l in lines:
            m = _re_result.search(l)
            if m:                
                total = int(m.group(1))
                failed = int(m.group(2))
                skipped = int(m.group(3))
                time = m.group(4)
              
        self.step_status.setStatistic('total', total)
        self.step_status.setStatistic('failed', failed)
        self.step_status.setStatistic('skipped', skipped)
        self.step_status.setStatistic('time', time)

def RunTests(factory, testproject, configuration, testdirectory):
        factory.addStep(xUnitConsole(command=['src\\xunit.console\\bin\\%s\\xunit.console.exe' % configuration,
                                '%s\\%s' % (testdirectory, testproject), "/html", "%s\\testresults.html" % testdirectory],
                                     workdir='build\\', name="Run %s" % testproject,
                                     description="Running %s" % testproject ,
                                     descriptionDone="Run %s done" % testproject,
                                     flunkOnFailure=False, warnOnFailure=True,
                                     logfiles = {"xunit-report" : "%s\\testresults.html" % testdirectory }, lazylogfiles=True))

The waterfall page now has a link to xunit.net report, as you can see bellow:

On my next post I’ll show you how to customize the scheduler.

Thursday, August 9, 2012

Buildbot Part II – Running automated build and tests

On my previous post we setup buildbot. Now, let’s configure it to run automated builds and tests. To do this, I’ll be using my xUnit.Net fork hosted on git/codeplex.
Make the following changes to master.cfg:

Connect to the buildslave

c['slaves'] = [BuildSlave("BuildSlave01", "mysecretpwd")]

Track repository – GitPoller

The gitpoller periodically fetches from a remote git repository and process any changes.

from buildbot.changes.gitpoller import GitPoller
c['change_source'] = []
c['change_source'].append(GitPoller(
        'https://git01.codeplex.com/forks/mariangemarcano/xunit',
        gitbin='C:\\Program Files (x86)\\Git\\bin\\git.exe',
        workdir='gitpoller_work',
        pollinterval=300))

Add scheduler

There are different types of schedulers supported by buildbot and you can customize it. Let’s add a single branch scheduler to run the build whenever a change is committed to the repository and knightly schedulers to run the build once a day.

from buildbot.schedulers.basic import SingleBranchScheduler
from buildbot.schedulers.timed import Nightly
from buildbot.changes import filter

c['schedulers'] = []
c['schedulers'].append(SingleBranchScheduler(
                            name="Continuous",
                            change_filter=filter.ChangeFilter(branch='master'),
                            treeStableTimer=3*60,
                            builderNames=["Continuous"]))
c['schedulers'].append(Nightly(
                            name="Ninghtly",
                            branch='master',
                            hour=3,
                            minute=30,
                            onlyIfChanged=True,
                            builderNames=["Ninghtly"]))

Define a Build Factory

It contains the steps that a build will follow. In our case, we need to get source from repository, compile xunit.net projects and run the tests.

Get source from repository: Let’s do an incremental update for continuous build and compile debug; the full build will do a full repository checkout and compile using release configuration. The git factory needs to access git binaries, set the environment path to where you install it for example: C:\Program Files (x86)\Git\bin\.

Compile xunit.net and run tests: Let’s compile using MSbuild.exe and run the tests with xunit.console.exe. By setting flunkOnFailure=False/warnOnFailure=True, test failures will mark the overall build as having warnings.

from buildbot.process.factory import BuildFactory
from buildbot.steps.source.git import Git
from buildbot.steps.shell import ShellCommand

def MSBuildFramework4(factory, configuration, projectdir):
    factory.addStep(ShellCommand(command=['%windir%\\Microsoft.NET\\Framework\\v4.0.30319\\MSbuild.exe',
                                          'xunit.msbuild', '/t:Build', '/p:Configuration=%s' % configuration],
                                 workdir=projectdir, name="Compile xunit.net projects",
                                 description="Compiling xunit.net projects",
                                 descriptionDone="Copile xunit.net projects done"))

def RunTests(factory, testproject, configuration, testdirectory):
        factory.addStep(ShellCommand(command=['src\\xunit.console\\bin\\%s\\xunit.console.exe' % configuration,
                                '%s\\%s' % (testdirectory, testproject), "/html", "%s\\testresults.html" % testdirectory],
                                     workdir='build\\', name="Run %s" % testproject,
                                     description="Running %s" % testproject ,
                                     descriptionDone="Run %s done" % testproject,
                                     flunkOnFailure=False, warnOnFailure=True))

factory_continuous = BuildFactory()
factory_continuous.addStep(Git(repourl='https://git01.codeplex.com/forks/mariangemarcano/xunit',  mode='incremental'))
MSBuildFramework4(factory_continuous, 'Debug', 'build\\')
RunTests(factory_continuous, 'test.xunit.dll', 'Debug', 'test\\test.xunit\\bin\\Debug')
RunTests(factory_continuous, 'test.xunit.gui.dll', 'Debug', 'test\\test.xunit.gui\\bin\\Debug')

factory_ninghtly = BuildFactory()
factory_ninghtly.addStep(Git(repourl='https://git01.codeplex.com/forks/mariangemarcano/xunit', mode='full', method='clobber'))
MSBuildFramework4(factory_ninghtly, 'Release', 'build\\')
RunTests(factory_ninghtly, 'test.xunit.dll', 'Release', 'test\\test.xunit\\bin\\Release')
RunTests(factory_ninghtly, 'test.xunit.gui.dll', 'Release', 'test\\test.xunit.gui\\bin\\Release')

Builder Configuration

from buildbot.config import BuilderConfig

c['builders'] = []
c['builders'].append(
    BuilderConfig(name="Continuous",
      slavenames=["BuildSlave01"],
      factory=factory_continuous))
c['builders'].append(
    BuilderConfig(name="Ninghtly",
      slavenames=["BuildSlave01"],
      factory=factory_ninghtly))

Status notifications

You could also add MailNotifier to configure mail notifications.

c['status'] = []

from buildbot.status import html
from buildbot.status.web import authz, auth

authz_cfg=authz.Authz(
    # change any of these to True to enable; see the manual for more
    # options
    gracefulShutdown = False,
    forceBuild = True, # use this to test your slave once it is set up
    forceAllBuilds = False,
    pingBuilder = True,
    stopBuild = True,
    stopAllBuilds = False,
    cancelPendingBuild = True,
)
c['status'].append(html.WebStatus(http_port=8010, authz=authz_cfg))

Every commit done will trigger a continuous build that automatically compiles and runs tests. In the waterfall page you can see the people who has committed changes to the branch (view image bellow).

Click on the person’s name to view commit’s detail information like repository, branch and revision. At the bottom there is a list of changed files. This is very helpful to track down any failures (view image bellow).

On my next blog post I’ll show you how to add customizations.